Channel: Active questions tagged copy-paste - TeX - LaTeX Stack Exchange

Getting correct searchable text for Devanagari text


Consider this fairly minimal document, which AFAIK is the recommended way of typesetting Devanagari-script Sanskrit-language text:

\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{sanskrit}
\newfontfamily\devanagarifont[Script=Devanagari]{Chandas}
\begin{document}
किंबहुना।परस्परंद्वैधम्उत्पन्नम्।
\end{document}

When I typeset this, the output may look fine, but copying the text from the PDF gives incorrect results every time. I tried both xelatex and lualatex with four fonts, all freely available online (Chandas, Noto Sans Devanagari, Noto Serif Devanagari, and Adishila), and got the following:

  • Correct text:

    • किंबहुना।परस्परंद्वैधम्उत्पन्नम्।
  • xelatex:

    • कंबहुना।परɕपरंजैधम्उɊपਯम्। (Chandas)
    • ɫकʌबहुना।परȺरंद्वैधम्उत्पȡम्। (Noto Sans Devanagari)
    • ȫकबहुना।परस्परंद्वैधम्उत्पन्नम्। (Noto Serif Devanagari)
    • िकंबहुना।परस्परंद्वैधम्उत्पन्नम्। (Adishila)
  • lualatex:

    • िकंबहुना।पर�परंद्वैधम्उ�पन्नम्। (Chandas)
    • िकंबहुना।परस्परंद्वैधम्उत्पन्नम्। (Noto Sans Devanagari — also, the output is broken)
    • िकंबहुना।परस्परंद्वैधम्उत्पन्नम्। (Noto Serif Devanagari — also, the output is broken)
    • िकंबzना।परस्परंद्वैधम्उत्पन्नम्। (Adishila)

So none of these are correct, though for some combinations, only the first syllable was problematic. (It doesn't matter that it's the first syllable; किं anywhere has the same issue.)
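One way to see what differs in the "only the first syllable" cases is to dump the Unicode code points of the expected and the copied syllable. The following is a minimal Python sketch, not part of the original setup: the helper name `describe` is made up for illustration, and the `copied` string assumes the pasted text really starts with the vowel sign, as the Adishila line under xelatex appears to.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in text
    ]


# The syllable as typed in the source: KA, VOWEL SIGN I, ANUSVARA.
expected = "\u0915\u093f\u0902"
# Assumed reading of the copied syllable: the vowel sign comes first.
copied = "\u093f\u0915\u0902"

for label, text in (("expected", expected), ("copied", copied)):
    print(label)
    for line in describe(text):
        print("   " + line)

# True when both strings hold the same code points in a different order.
reordered = sorted(expected) == sorted(copied) and expected != copied
print("same code points, different order:", reordered)
```

If that assumption holds, the copied syllable contains the right code points in visual (glyph) order rather than logical order, which is a different failure from the cases where unrelated characters such as ɫ or Ⱥ appear.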

(Aside: This was with TeX Live 2020, so lualatex uses LuaHBTeX… yet for two of the fonts the typeset output itself is broken, unlike with xelatex.)

Is there a way of getting the correct text to be copied?

I also tried wrapping every word using the accsupp package, like \BeginAccSupp{ActualText=किं}किं\EndAccSupp{} and so on, but that results in complete gibberish.

