Forem Creators and Builders 🌱

Discussion on: తెలుగు language

Daniel Uber • Edited

Maybe? It doesn't look to be automatic in the way the documentation suggests, and the choice of locales is limited by the backend (my machine is using the I18n::Backend::Simple backend and loading locale data from translations in gems):

I18n.locale
=> :en

# try the easiest thing that could possibly work, ask for Telugu as a locale:
I18n.locale = :te
I18n::InvalidLocale: :te is not a valid locale
from /home/djuber/src/forem/vendor/bundle/ruby/3.0.0/gems/i18n-1.8.10/lib/i18n.rb:343:in `enforce_available_locales!'

I don't see any Indian languages on my machine (via I18n.available_locales). However, that list comes from the translations bundled with installed gems, and I'm not certain whether or how the documentation's note about using the locale applies:

[33] pry(main)> I18n.locale=:ua
=> :ua                         

# English still works with the locale set:
[34] pry(main)> "Any thing you like.".parameterize
=> "any-thing-you-like"

# Ukrainian doesn't work, even with the Ukrainian locale:
[36] pry(main)> "Пароль успішно встановлено. Ви успішно увійшли.".parameterize
=> ""     
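To make the empty result less mysterious, here is a rough standard-library stand-in for what parameterize does to the string. It is a simplification and not ActiveSupport's actual code: the name ascii_only_parameterize is made up for illustration, and it assumes no transliteration rules are loaded for the locale, so nothing non-ASCII survives.

```ruby
# Hypothetical, simplified stand-in for ActiveSupport's parameterize.
# Assumption: no transliteration rules exist for the text's script.
def ascii_only_parameterize(text)
  text.gsub(/[^a-z0-9\-_]+/i, "-") # each run of other characters becomes one "-"
      .gsub(/-{2,}/, "-")          # collapse repeated separators
      .gsub(/\A-|-\z/, "")         # trim a leading or trailing separator
      .downcase
end

puts ascii_only_parameterize("Any thing you like.")
puts ascii_only_parameterize("Пароль успішно встановлено. Ви успішно увійшли.").inspect
```

With an all-Cyrillic title the whole string is a single run of "other" characters, so it collapses to one separator that is then trimmed away, leaving an empty slug.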

I don't think there's a magic bullet here, but if you wanted to experiment with the slug format, you might try reimplementing parameterize without the removal of non-ASCII characters, keeping the whitespace treatment, and then CGI.escape the result to get a string like

"%E0%B0%A4%E0%B1%86%E0%B0%B2%E0%B1%81%E0%B0%97%E0%B1%81-language-790" which would display correctly as long as your browser had script support.

That might be beyond a short-term extension, or have side effects I'm not aware of, but it may work if you try changing the code in a local instance.
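A minimal sketch of that experiment, using only the standard library. The helper name unicode_slug and the numeric suffix argument are assumptions for illustration, not Forem's actual title_to_slug; the idea is just to keep letters, combining marks, and digits from any script, turn everything else into hyphens, and percent-encode the result.

```ruby
require "cgi"

# Hypothetical helper: a slug that keeps non-ASCII letters instead of stripping them.
def unicode_slug(title, suffix = nil)
  slug = title.strip.downcase
              .gsub(/[^\p{L}\p{M}\p{N}]+/, "-") # keep letters, marks, digits of any script
              .gsub(/\A-+|-+\z/, "")            # trim leading/trailing hyphens
  slug = "#{slug}-#{suffix}" if suffix          # optional suffix, e.g. a random number
  CGI.escape(slug)                              # percent-encode for use in a URL path
end

puts unicode_slug("తెలుగు language", 790)
# => %E0%B0%A4%E0%B1%86%E0%B0%B2%E0%B1%81%E0%B0%97%E0%B1%81-language-790
```

Keeping \p{M} matters for Telugu, since the vowel signs are combining marks and would otherwise be replaced by hyphens in the middle of a word.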

9comindia

hi @djuber

from: github.com/sporkmonger/addressable

uri = Addressable::URI.parse("http://www.詹姆斯.com/")
uri.normalize
#=> #<Addressable::URI:0xc9a4c8 URI:http://www.xn--8ws00zhy3a.com/>

Would the Addressable gem be relevant here, instead of calling the parameterize method in title_to_slug?

Daniel Uber • Edited

I think Addressable is doing something very different than what you want?

Is xn--8ws00zhy3a here related at all to the input 詹姆斯, or more legible? It looks like this is a way to encode for transfer (as ASCII), rather than to encode for human readability. I might be misunderstanding it, particularly if there's some automatic display translation in the browser. While it does preserve the information from the title, it doesn't look like it does so in any human-meaningful way.

If you wanted to try that, for title strings you probably just want the to_ascii method and not URI.parse (you're not building a URI without making even larger changes):

Addressable::IDNA.to_ascii('తెలుగు language')
=> "xn-- language-zhy2gyguhb5e" 

You could try it out, but I don't think this fits what you're asking about: a URL slug that contains the post title. Judging by that output, you would still need to strip the spaces, probably before applying the Punycode translation.

9comindia • Edited

Is xn--8ws00zhy3a here related at all to the input 詹姆斯, or more legible?

Not just legible: if I click the link xn--8ws00zhy3a.com, the page that opens shows the Chinese characters in the URL.
[screenshot: Chinese URL]

This may not be exactly what I was initially describing, but I'm looking for something that could improve SEO for non-English Forem instances.

Addressable::IDNA.to_ascii('తెలుగు language')
=> "xn-- language-zhy2gyguhb5e"

This does not resolve to the Telugu characters in the URL.
[screenshot: expected Telugu URL]

Daniel Uber

It looks like the "sterile" library will do transliteration:

Transliterate article titles to slugs #15051

What type of PR is this? (check all applicable)

  • [ ] Refactor
  • [x] Feature
  • [ ] Bug Fix
  • [ ] Optimization
  • [ ] Documentation Update

Description

Article titles written in languages with non-Roman alphabets (Cyrillic, Greek, Hebrew, etc) have the entire title stripped from the slug.

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

  • Publish an article with a title that is completely in a non-Roman alphabet
  • The URL for the article should contain those characters transliterated to ASCII

UI accessibility concerns?

Ideally, nothing should change on this front.

Added/updated tests?

  • [x] Yes
  • [ ] No, and this is why: please replace this line with details on why tests have not been included
  • [ ] I need help with writing tests

[Forem core team only] How will this change be communicated?

Will this PR introduce a change that impacts Forem members or creators, the development process, or any of our internal teams? If so, please note how you will share this change with the people who need to know about it.

  • [ ] I've updated the Developer Docs or Storybook (for Crayons components)
  • [ ] This PR changes the Forem platform and our documentation needs to be updated. I have filled out the Changes Requested issue template so Community Success can help update the Admin Docs appropriately.
  • [ ] I've updated the README or added inline documentation
  • [ ] I've added an entry to CHANGELOG.md
  • [x] I will share this change in a Changelog or in a forem.dev post
  • [ ] I will share this change internally with the appropriate teams
  • [ ] I'm not sure how best to communicate this change and need help
  • [ ] This change does not need to be communicated, and this is why not: please replace this line with details on why this change doesn't need to be shared
View on GitHub: https://github.com/forem/forem/pull/15051


This PR might fix things for cases like yours.
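For a rough idea of what "transliterated to ASCII" means for a slug, here is a toy sketch. It does not use the sterile library or reproduce the PR's code: the mapping table CYRILLIC_TO_LATIN and the helper transliterated_slug are hypothetical and cover only the handful of letters needed for the example.

```ruby
# Hypothetical, deliberately tiny mapping; a real transliteration library
# covers far more characters and scripts.
CYRILLIC_TO_LATIN = {
  "п" => "p", "а" => "a", "р" => "r", "о" => "o", "л" => "l", "ь" => ""
}.freeze

def transliterated_slug(title)
  # Replace each known character with its Latin equivalent, leave the rest as-is.
  ascii = title.downcase.each_char.map { |ch| CYRILLIC_TO_LATIN.fetch(ch, ch) }.join
  # Then apply the usual slug cleanup on what is now (mostly) ASCII.
  ascii.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

puts transliterated_slug("Пароль 2021") # => parol-2021
```

The trade-off compared with percent-encoding the original characters is that the slug stays plain ASCII, but readers see a romanized form of the title instead of their own script.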