Forem Creators and Builders 🌱

Christina Gorton for Forem Core Team

Discuss: Non-ASCII title makes a weird URL on Forem/dev.to

Forem/dev.to has users who write in languages other than English, so articles with non-ASCII titles are to be expected. However, when a title consists only of non-ASCII characters, Forem/dev.to generates an odd URL with no meaningful slug.

You can read more about the proposed solutions and two related issues in the GitHub issue below.

Non-ASCII title makes a weird URL on Forem/dev.to #12350

The following article is a good example: https://dev.to/chokri/-eg9

(This issue reorganizes #6942: the problem is described here, and solutions are proposed in the separate issues #12351 and #12352.)

To Reproduce

  1. Start "Write a post" from dev.to
  2. Fill its title: with non-ASCII characters (e.g. テスト記事)
  3. Mark published: to true
  4. Save changes
  5. See that the URL consists only of a random alphanumeric string (e.g. /user/-eg9)
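
To make the failure mode concrete, here is a minimal Python sketch of a slugger that keeps only ASCII letters and digits and then appends a short random tail. It is not Forem's actual code (Forem is a Rails app); `naive_slug` and its three-character tail are assumptions chosen only to mirror the `/user/-eg9` shape from step 5.

```python
import re
import secrets
import string

TAIL_ALPHABET = string.ascii_lowercase + string.digits


def naive_slug(title: str, tail_length: int = 3) -> str:
    """Hypothetical ASCII-only slugger: non-ASCII characters are simply dropped."""
    # Drop every character that has no ASCII encoding.
    ascii_only = title.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of anything other than a-z / 0-9 into single hyphens.
    words = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
    # Random tail to keep slugs unique (its length is an assumption).
    tail = "".join(secrets.choice(TAIL_ALPHABET) for _ in range(tail_length))
    return f"{words}-{tail}"


print(naive_slug("Test article"))  # something like "test-article-k3q"
print(naive_slug("テスト記事"))      # only the tail survives, something like "-eg9"
```

With a fully non-ASCII title, nothing is left before the hyphen, so the URL carries no information about the article.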

Expected behavior

The article should have a meaningful URL with a meaningful slug.

Screenshots

In place of screenshots, the example article (same as above) illustrates the issue well: https://dev.to/chokri/-eg9

Oldest comments (6)

Christina Gorton

See also this issue:
github.com/forem/forem/issues/9425

Varhal

Hi! At what stage is the solution to this problem?

Kishan B

It's not only internationalization. Say I make a typo in the title and share the URL; if I then fix/refactor the title, the existing links become invalid.

Also, the user ID is exposed in the URL.

E.g. dev.to/kishanbsh/capturing-custom-...

So if I either change my username or fix/refactor the title, the URL breaks.

URLs should be permanent for a post, in my opinion.

Kishan B

Also, short URLs are cooler :-)

yheuhtozr

I totally agree with the original approaches mentioned in the issue, namely:

  • Make an article's slug configurable independently of its title
  • Accept %-encoded Unicode characters in an article's slug

I think we had better have both.
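
As a rough illustration of how the two approaches could coexist, here is a minimal Python sketch. The helper `unicode_slug` and its `custom_slug` parameter are hypothetical names, not part of Forem; the sketch only assumes that an author-supplied slug, when present, takes priority over one derived from the title, and that Unicode letters are kept and percent-encoded when placed in a URL.

```python
import re
from typing import Optional
from urllib.parse import quote, unquote


def unicode_slug(title: str, custom_slug: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an author-supplied slug, else derive one from the title."""
    source = custom_slug if custom_slug else title
    # Python 3 str patterns are Unicode-aware, so \W leaves kana and kanji intact
    # and only punctuation, whitespace, and underscores collapse into hyphens.
    return re.sub(r"[\W_]+", "-", source.lower()).strip("-")


slug = unicode_slug("テスト記事")
encoded = quote(slug)  # percent-encoded UTF-8 bytes, usable in a URL path
print(slug)            # テスト記事
print(encoded)
print(unquote(encoded) == slug)  # True: the encoding round-trips

# An explicit slug overrides the title-derived one.
print(unicode_slug("テスト記事", custom_slug="Test Article"))  # test-article
```

Browsers typically display the decoded form in the address bar, while the percent-encoded form is what travels over the wire.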

For implementation details, some users might prefer their letters converted into the Roman alphabet, whereas others dislike it (the reliability of automatic alphabetic transliteration varies a lot across languages). So, while we should allow users to customize their slug, we can enable an ASCII transliterator by default for a limited set of languages, such as this. (Note: This sentence assumes that the author can designate the language of a post. Transliteration based purely on script, without language information, would be even less ideal.)
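
A small Python sketch of why language information matters: the same Cyrillic word romanizes differently depending on the language it is written in. The tables below are deliberately tiny and hand-written for this one example, and `transliterate` and `pick_slug` are hypothetical names, not an existing API.

```python
from typing import Optional

# Tiny illustrative tables covering a single word; real schemes cover full alphabets.
# The letter "г" is conventionally romanized as "g" in Russian and "h" in Ukrainian.
TABLES = {
    "ru": {"г": "g", "о": "o", "р": "r", "а": "a"},
    "uk": {"г": "h", "о": "o", "р": "r", "а": "a"},
}


def transliterate(text: str, lang: str) -> Optional[str]:
    """Return an ASCII rendering, or None if the language or a character is unsupported."""
    table = TABLES.get(lang)
    if table is None or any(ch not in table for ch in text):
        return None
    return "".join(table[ch] for ch in text)


def pick_slug(title: str, lang: str, custom_slug: Optional[str] = None) -> str:
    """Author's slug first, then a language-aware default, else the title unchanged."""
    return custom_slug or transliterate(title, lang) or title


print(pick_slug("гора", "ru"))  # gora
print(pick_slug("гора", "uk"))  # hora
print(pick_slug("гора", "ja"))  # гора (no table, so no automatic transliteration)
print(pick_slug("гора", "uk", custom_slug="mountain"))  # mountain
```

Returning `None` for unsupported languages keeps the automatic default limited to languages where it is known to behave well, and leaves every other case to the author's own slug or another fallback.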

Daniel Uber

I think this is fixed (or less of an issue) by github.com/forem/forem/pull/15051 - we now use the sterile gem to transliterate to ASCII (the example title from the original issue results in tesutoji-shi plus a random tail).
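
For readers who want to see the shape of that result, here is a minimal Python sketch of "transliterate, then slugify, then add a random tail". It does not use or reproduce the sterile gem: `ROMANIZATION` is a hand-written stand-in limited to the example title, and the function name, the tail length, and the trailing-space convention for the kanji entries are all assumptions.

```python
import re
import secrets
import string

# Hand-written stand-in for a transliteration table (not the gem's data).
# The kanji entries carry a trailing space so adjacent characters stay separated.
ROMANIZATION = {"テ": "te", "ス": "su", "ト": "to", "記": "ji ", "事": "shi "}
TAIL_ALPHABET = string.ascii_lowercase + string.digits


def transliterated_slug(title: str, tail_length: int = 4) -> str:
    """Hypothetical pipeline: transliterate to ASCII, slugify, append a random tail."""
    # Map known characters; anything unmapped is left as-is for the next step.
    romanized = "".join(ROMANIZATION.get(ch, ch) for ch in title)
    # Drop whatever is still non-ASCII after the table lookup.
    ascii_only = romanized.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of anything other than a-z / 0-9 into single hyphens.
    words = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")
    tail = "".join(secrets.choice(TAIL_ALPHABET) for _ in range(tail_length))
    return f"{words}-{tail}"


print(transliterated_slug("テスト記事"))  # starts with "tesutoji-shi-", then a random tail
```

Note that "ji shi" matches the Chinese reading of 記事 rather than the Japanese reading "kiji", which is the kind of script-based (rather than language-based) result raised earlier in the thread.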