API & SDK · 2026-06-26 · Advanced

Generating localized App Store listing metadata within character limits using the Claude API

Raw translations overflow App Store Connect's character limits. Generate per-locale listing metadata within them using Claude API structured output and a repair loop.

Pasting a localized subtitle into App Store Connect and watching it turn red with "exceeds 30 characters" is something I have hit on every multilingual release of my own indie apps. A catchy line that fits comfortably in English balloons to about 1.5x the length in German, and in Japanese you can cram in so much that nobody can read it.

Before translation quality even enters the picture, this wall (different limits per field, different lengths per language) becomes the operational bottleneck. Here we use the Claude API's Tool Use (structured output) to generate each locale's listing inside its limits, and automatically re-tighten anything that overflows.

Why pasting raw translations gets rejected

App Store Connect's localizable metadata has a fixed character limit per field. As of June 2026 the main limits are below (these can change, so always confirm against the current official values).

Field	Limit (chars)	Role
App name	30	Indexed for search. The most important field
Subtitle	30	Indexed for search. Supporting line in lists
Keywords	100	Comma-separated, 100 chars total. Hidden
Promotional text	170	Swappable without review
Description	4000	Not indexed for search

The tricky part is that these limits are counted in code points, not bytes. A Japanese full-width character counts as one character, just like a Latin letter. So Japanese 30 characters is the same slot as English 30 characters, but since each Japanese character carries more meaning, writing with an English mindset leaves it half-empty. Conversely, compounding-heavy languages like German or Finnish stretch the same meaning across more characters and blow past the limit easily.
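
To make the code-point versus byte distinction concrete, here is a minimal sketch. The three sample subtitles are made-up illustrations, not strings from a real listing, and `measure` is a helper name introduced only for this example.

```python
# Compare code-point counts (what Python's len() returns) with UTF-8 byte counts.
# The sample strings are invented for illustration.
samples = {
    "en": "Calm wallpapers daily",
    "de": "Täglich ruhige Hintergrundbilder",
    "ja": "毎日届く、静かな壁紙",
}

def measure(text):
    """Return (code_points, utf8_bytes) for a string."""
    return len(text), len(text.encode("utf-8"))

for locale, text in samples.items():
    points, nbytes = measure(text)
    fits = points <= 30  # the subtitle limit from the table above
    print(f"{locale}: {points} code points, {nbytes} UTF-8 bytes, fits 30: {fits}")
```

The Japanese sample uses only a third of the 30-character slot even though it takes three bytes per character, while the German one overflows with ordinary words. That asymmetry is what the rest of the pipeline has to manage.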

The keyword field also has its own etiquette: comma-separated within 100 characters total, no spaces (write "a,b,c" rather than "a, b, c"), do not repeat words already used in the app name, and do not include both singular and plural forms. Break these rules and you waste some of the precious 100 characters. A translation engine knows none of these platform-specific constraints.

Treat the character limits as a schema

The first thing to do is pin the limits down as a single schema rather than scattered constants. Generation, validation, and repair all reference this one definition.

# app_store_limits.py
# App Store Connect localizable metadata and character limits (as of June 2026)
FIELD_LIMITS = {
    "name": 30,
    "subtitle": 30,
    "keywords": 100,
    "promotional_text": 170,
}

This FIELD_LIMITS feeds straight into the Tool Use input schema as maxLength. Declaring the limits in the schema reinforces the instruction to the model, and validation checks against the exact same numbers when the model exceeds them.
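
As a rough sketch of what "one definition" can look like in practice, the schema properties and the overflow check below are both derived from the same dict. `schema_properties` and `over_limit` are hypothetical helper names introduced for this illustration; they are not part of any SDK.

```python
# Single source of truth for the limits (same values as app_store_limits.py).
FIELD_LIMITS = {"name": 30, "subtitle": 30, "keywords": 100, "promotional_text": 170}

def schema_properties(limits):
    """Build one JSON Schema string property per field, capped at its limit."""
    return {field: {"type": "string", "maxLength": limit} for field, limit in limits.items()}

def over_limit(listing, limits):
    """Return {field: chars_over} for every field that exceeds its limit."""
    return {
        field: len(listing.get(field, "")) - limit
        for field, limit in limits.items()
        if len(listing.get(field, "")) > limit
    }

props = schema_properties(FIELD_LIMITS)
print(props["subtitle"])
print(over_limit({"subtitle": "x" * 33}, FIELD_LIMITS))
```

Because both helpers iterate over the dict, adding or changing a field is a one-line edit instead of a hunt through generation and validation code.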

Force structured output with Tool Use

If you simply ask for "JSON, please" in free text, you get prose mixed in and field names that drift. With Tool Use and tool_choice, the model is forced to call your tool with input matching the schema, so the output is locked into a structure.

import anthropic
import json
 
client = anthropic.Anthropic()  # ANTHROPIC_API_KEY is read from the environment
 
SYSTEM_RULES = (
    "You localize App Store listings. Strictly observe each field's character limit. "
    "Limits are counted in code points (full-width counts as one). "
    "For keywords: comma-separated with no spaces, avoid words already in the app name, "
    "and do not include both singular and plural forms."
)
 
def build_tool(limits):
    return {
        "name": "emit_listing",
        "description": "Return localized App Store listing metadata",
        "input_schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string", "maxLength": limits["name"]},
                "subtitle": {"type": "string", "maxLength": limits["subtitle"]},
                "keywords": {"type": "string", "maxLength": limits["keywords"]},
                "promotional_text": {"type": "string", "maxLength": limits["promotional_text"]},
            },
            "required": ["name", "subtitle", "keywords", "promotional_text"],
        },
    }
 
def generate_listing(locale, source, glossary, limits):
    tool = build_tool(limits)
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[tool],
        tool_choice={"type": "tool", "name": "emit_listing"},
        system=[
            {"type": "text", "text": SYSTEM_RULES},
            {"type": "text", "text": glossary, "cache_control": {"type": "ephemeral"}},
        ],
        messages=[{
            "role": "user",
            "content": f"Target locale: {locale}\nSource English listing:\n{json.dumps(source, ensure_ascii=False)}",
        }],
    )
    for block in resp.content:
        if block.type == "tool_use":
            return block.input
    raise RuntimeError("No structured output was returned")

The key is tool_choice={"type": "tool", "name": "emit_listing"}. The model is now obligated to call emit_listing, and block.input comes back as a plain dict.
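
One case worth guarding against is a response that stops at max_tokens before the tool input is complete. The sketch below is a slightly more defensive version of the extraction loop, under the assumption that the response exposes stop_reason and content blocks with type, name, and input, as the code above relies on. `extract_tool_input` is a hypothetical helper name, and the response here is a stand-in object rather than a real API call.

```python
from types import SimpleNamespace

def extract_tool_input(response, tool_name):
    """Return the input dict of the named tool_use block, or raise with a reason."""
    if getattr(response, "stop_reason", None) == "max_tokens":
        # Output was cut off, so the tool input may be incomplete.
        raise RuntimeError("Response hit max_tokens; raise the limit and retry")
    for block in response.content:
        if block.type == "tool_use" and block.name == tool_name:
            return dict(block.input)
    raise RuntimeError(f"No tool_use block named {tool_name!r} in the response")

# Stand-in for an API response, shaped like the fields used above (illustrative values).
fake = SimpleNamespace(
    stop_reason="tool_use",
    content=[SimpleNamespace(
        type="tool_use",
        name="emit_listing",
        input={"name": "Calm Walls", "subtitle": "Quiet wallpapers",
               "keywords": "wallpaper,calm", "promotional_text": "New packs weekly"},
    )],
)
listing = extract_tool_input(fake, "emit_listing")
print(listing["subtitle"])
```

Raising on truncation keeps a half-written listing from flowing into validation, where it might look like a legitimately short field.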

Always re-validate — the model breaks the limit

This is the pitfall you will not see from the docs alone. Even with maxLength in the schema, the model often runs over by a few characters. It happens most with Japanese and German. In my own wallpaper apps' metadata, subtitles came back at 32 or 33 characters surprisingly often. You have to accept that the schema is a strong instruction, not an enforced validator.

So you must count it yourself after generation.

def count_chars(s):
    # App Store counts roughly in Unicode code points.
    # A Japanese full-width character counts as one (not bytes).
    return len(s)
 
def validate(listing, limits):
    problems = []
    for field, limit in limits.items():
        value = listing.get(field, "")
        n = count_chars(value)
        if n > limit:
            problems.append({"field": field, "count": n, "limit": limit, "over": n - limit})
    # Keyword-specific rule (spaces waste characters)
    kw = listing.get("keywords", "")
    if ", " in kw or " ," in kw:
        problems.append({"field": "keywords", "count": count_chars(kw), "limit": limits["keywords"], "over": 0})
    return problems

len() is strictly a code-point count, so emoji grapheme clusters (ZWJ sequences) could diverge from Apple's counting. But emoji are rarely used in store text fields, and in practice len() matches closely enough.
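
A short sketch of where len() can diverge from what a reader sees. Whether App Store Connect normalizes text before counting is not something this article establishes, so the NFC step below is an assumption, offered only as a way to keep counts stable when text arrives in decomposed form. `count_chars_nfc` is a hypothetical name.

```python
import unicodedata

def count_chars_nfc(s):
    """Count code points after NFC normalization (composes base + combining marks)."""
    return len(unicodedata.normalize("NFC", s))

decomposed_latin = "Cafe\u0301"   # "Café" stored as e + combining acute accent
decomposed_kana = "\u304b\u3099"  # "が" stored as か + combining voiced sound mark
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # one visible emoji, a ZWJ sequence

print(len(decomposed_latin), count_chars_nfc(decomposed_latin))
print(len(decomposed_kana), count_chars_nfc(decomposed_kana))
print(len(family))  # NFC does not merge ZWJ sequences; this stays at five code points
```

Decomposed text can come from copy and paste out of some editors and file systems, so normalizing before validation is a cheap way to avoid spurious one- or two-character overflows.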

A repair loop that compresses only the over-limit fields

When validation finds an overflow, do not regenerate from scratch. Send the draft back with the over-limit fields named and exactly how many characters each must lose. Making the delta explicit helps the model shorten while preserving meaning.

def compress_fields(locale, listing, problems, limits):
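    # Length overflows: tell the model exactly how many characters to cut.
    hints = [
        f'{p["field"]} is {p["count"]} chars. Cut {p["over"]} more to fit the {p["limit"]} limit'
        for p in problems if p["over"] > 0
    ]
    # validate() reports keyword spacing with over == 0; spell that fix out too,
    # otherwise the repair prompt would carry an empty hint.
    if any(p["field"] == "keywords" and p["over"] == 0 for p in problems):
        hints.append("keywords must have no spaces around commas")
    hint = "; ".join(hints)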
    tool = build_tool(limits)
    resp = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=512,
        tools=[tool],
        tool_choice={"type": "tool", "name": "emit_listing"},
        system=[{"type": "text", "text": SYSTEM_RULES}],
        messages=[{
            "role": "user",
            "content": (
                f"Target locale: {locale}\nCurrent draft: {json.dumps(listing, ensure_ascii=False)}\n"
                f"Resolve these overflows (keep the meaning): {hint}"
            ),
        }],
    )
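    # Accept the rewrite only for the fields that had problems; keep the rest as-is.
    flagged = {p["field"] for p in problems}
    for block in resp.content:
        if block.type == "tool_use":
            return {**listing, **{f: v for f, v in block.input.items() if f in flagged}}
    return listing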
 
def generate_with_repair(locale, source, glossary, limits, max_retries=2):
    listing = generate_listing(locale, source, glossary, limits)
    for _ in range(max_retries):
        problems = validate(listing, limits)
        if not problems:
            return listing
        listing = compress_fields(locale, listing, problems, limits)
    return hard_truncate(listing, limits)  # last resort: trim mechanically
 
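def hard_truncate(listing, limits):
    # Last resort: slice each over-limit field to its limit, then strip any
    # separator (ASCII comma, Japanese comma, space) left dangling by the cut.
    out = dict(listing)
    for field, limit in limits.items():
        if count_chars(out.get(field, "")) > limit:
            out[field] = out[field][:limit].rstrip(",、 ")
    return out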

Try repair twice; if it still does not fit, trim mechanically. A hard slice can cut mid-word and break meaning, so treat it strictly as a last resort; the only cleanup it performs afterward is dropping trailing separators. In production, two loops brought 95%+ of fields within the limit. hard_truncate only fired for locales with extremely long compound words.
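
If mid-word cuts are a concern, one possible variation is to back up to the last separator before slicing. This is a sketch, not the approach used above: `truncate_at_boundary` is a hypothetical helper, and it falls back to a plain slice when no separator exists, which is common for Japanese text or a single long compound.

```python
def truncate_at_boundary(text, limit, separators=" ,、"):
    """Trim text to at most `limit` code points, preferring the last separator."""
    if len(text) <= limit:
        return text
    head = text[:limit]
    # If the character right after the cut is a separator, the cut is already clean.
    if text[limit] in separators:
        return head.rstrip(separators)
    cut = max(head.rfind(sep) for sep in separators)
    if cut <= 0:
        # No usable separator: fall back to the hard slice.
        return head
    return head[:cut].rstrip(separators)

print(truncate_at_boundary("wallpaper,calm,nature", 16))
print(truncate_at_boundary("Calm wallpapers every day", 20))
print(truncate_at_boundary("Hintergrundbilder", 10))
```

The trade-off is that a boundary cut can drop more characters than strictly necessary, so it suits keywords and short phrases better than a subtitle that must still read as a sentence.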

Spend the keyword field's 100 characters without waste

Keywords are hidden but feed directly into search ranking, which makes this the highest-leverage ASO field. To avoid wasting a single character of that narrow budget, run a normalizer after generation.

def normalize_keywords(raw, app_name, limit=100):
    seen, kept = set(), []
    banned = {w.lower() for w in app_name.replace(",", " ").split()}
    for term in raw.replace("、", ",").split(","):
        t = term.strip()
        key = t.lower()
        if not t or key in seen or key in banned:
            continue  # drop empties, duplicates, and words already in the app name
        candidate = ",".join(kept + [t])  # join without spaces and check length
        if count_chars(candidate) > limit:
            continue  # skip any term that would push us over
        seen.add(key)
        kept.append(t)
    return ",".join(kept)

The move that pays off here is dropping words already used in the app name. Apple indexes the app name, subtitle, and keywords together. Repeating a word in keywords spends your precious 100 characters on a term that will not be scored twice. Deduping and removing spaces alone freed up the equivalent of 10–15 characters in my own listings.
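
The normalizer above does not handle the singular/plural rule mentioned earlier. A very rough sketch for English keywords follows. The suffix rules are naive and will miss pairs such as "house" and "houses", other languages need their own logic, and both function names are hypothetical.

```python
def singular_key(word):
    """Very rough English singularization, used only as a dedupe key."""
    w = word.lower()
    if len(w) > 3 and w.endswith("ies"):
        return w[:-3] + "y"          # stories -> story
    if w.endswith(("ses", "xes", "zes", "ches", "shes")):
        return w[:-2]                # boxes -> box
    if w.endswith("s") and not w.endswith("ss"):
        return w[:-1]                # photos -> photo, but glass stays glass
    return w

def drop_plural_duplicates(keywords):
    """Keep the first of each singular/plural pair in a comma-separated string."""
    seen, kept = set(), []
    for term in keywords.split(","):
        t = term.strip()
        key = singular_key(t)
        if key and key not in seen:
            seen.add(key)
            kept.append(t)
    return ",".join(kept)

print(drop_plural_duplicates("wallpaper,wallpapers,photo,photos,glass"))
```

Running this before normalize_keywords would free the characters spent on the duplicate form; the key is used only for comparison, so the kept term is never rewritten.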

Keep brand and tone consistent across locales

The thing most likely to drift in multilingual rollouts is the consistency of brand names, product names, and tone: one locale translates the product name, another leaves it in English. To prevent this, gather a glossary (proper nouns to leave untranslated, tone guidance, banned phrasings) into one text and share it across every locale's generation.

Putting that glossary on the prompt cache with cache_control means that even though you send it once per locale, the glossary's input tokens are billed at the cheaper cache-read rate from the second call on. When running 20 supported languages at once, the longer the glossary, the bigger the win; in my case the overall input cost of generation dropped visibly. The prompt-cache design itself is covered in "Cutting your monthly API cost in half with prompt caching".
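
To get a feel for the size of the saving, here is a back-of-the-envelope sketch. The multipliers (1.25x base input price for a cache write, 0.1x for a cache read) are assumptions to check against current pricing, the 3,000-token glossary is an invented figure, and the model assumes every call after the first lands while the cache entry is still alive.

```python
def glossary_input_cost(glossary_tokens, locales, write_mult=1.25, read_mult=0.10):
    """Return (uncached, cached) glossary input cost in base-input-token units."""
    uncached = glossary_tokens * locales
    # The first call writes the cache; the remaining calls read it.
    cached = glossary_tokens * (write_mult + read_mult * (locales - 1))
    return uncached, cached

uncached, cached = glossary_input_cost(glossary_tokens=3000, locales=20)
print(f"uncached: {uncached:.0f}, cached: {cached:.0f}, saved: {1 - cached / uncached:.0%}")
```

Under these assumptions the glossary portion of the input cost falls by roughly 84% across 20 locales. Two caveats apply: requests fired in parallel may miss the cache until the first one has written it, so it can help to run one locale first, and a glossary shorter than the minimum cacheable length will not be cached at all.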

A note: keep your in-app string (Localizable.strings) translation as a separate pipeline from store listings. Listings are about character limits and ASO; UI strings are about terminology consistency and placeholder protection. These are different optimization axes. For the UI side, see "Batch-translating Localizable.strings with a glossary".

Pitfalls in real operation

Finally, a few traps I stepped in myself.

First, stuffing keywords into the subtitle gets rejected in review. When you focus only on fitting the character count, the subtitle tends to become a meaningless string of words, but review expects it to read as natural text a human would write. Spelling out "must read as a natural sentence" in the repair-loop prompt keeps you safe.

Second, understand locale fallback. If you ship only es-ES without es-MX, users in Mexico get Spain-targeted wording. You do not need to fill every locale, but you should know where your key markets fall back to.

Third, always have a human do the final check. Structured output and the repair loop take care of the mechanical correctness of length and format, but whether the copy appeals to a native speaker is a separate question. I lay out the results in a sheet and have native-speaker friends skim just the major languages. If you want to extend this structured-output validate-and-repair pattern to other uses, "Schema validation and repair loops for structured output" is a good reference.
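
For the review sheet, a CSV with one row per locale and field is usually enough to paste into a spreadsheet. The sketch below is one way to produce it; the column names and both helper names are my own choices for illustration.

```python
import csv
import io

COLUMNS = ["locale", "field", "text", "chars", "limit", "left"]

def review_rows(listings, limits):
    """Yield one row per locale and field with its character count and headroom."""
    for locale, listing in listings.items():
        for field, limit in limits.items():
            value = listing.get(field, "")
            yield {"locale": locale, "field": field, "text": value,
                   "chars": len(value), "limit": limit, "left": limit - len(value)}

def write_review_csv(listings, limits, fh):
    """Write the review rows to any text file handle."""
    writer = csv.DictWriter(fh, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(review_rows(listings, limits))

# Illustrative data; in practice listings maps each locale to generate_with_repair output.
buf = io.StringIO()
write_review_csv({"ja": {"subtitle": "毎日届く、静かな壁紙"}}, {"subtitle": 30}, buf)
print(buf.getvalue())
```

Showing the remaining headroom next to each string lets a reviewer suggest a better wording without having to count characters by hand.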

To start, drop your app's English listing into source and run generate_with_repair for a single locale. Just watching whether the subtitle fits on the first pass, or which repair round finally gets it to fit, tells you which languages your copy tends to balloon in.
