IVS in string searching #24
base: gh-pages
✅ Deploy Preview for w3c-string-search ready!
This is a good start, but I'd change where this is located. Consider making some graphics to illustrate IVS.
@@ -97,7 +97,11 @@ <h3>Terminology</h3>
<p>Frequently this means that a <a>full-text search</a> employs indexes and natural language processing. When you are using a search engine, you are using a form of full text search. Full text search often breaks natural language text into words or phrases (this is called <a>segmentation</a>) and may apply complex processing to get at the semantic "root" values of words (this is called <a>stemming</a>). These processes are sensitive to language, context, and many other aspects of textual variation.</p>

<p class="definition"><dfn data-lt="natural language processing|NLP">Natural Language Processing</dfn> (<abbr title="natural language processing">NLP</abbr>) refers to the domain of software designed to understand, process, and manipulate human languages (that is, <a>natural language</a>). This is a very wide ranging term. It can cover relatively simple problems, such as word tokenization, or more complex behaviors, such as deriving "meaning" from text, recognizing parts of speech, performing accurate translation, and much else.</p>

<p class="definition"><dfn data-lt="Ideographic Variation Sequences|IVS">Ideographic Variation Sequences</dfn> refer to a <a href="https://www.unicode.org/glossary/#variation_sequence" target="_blank">variation sequence</a> registered in the <a href="https://www.unicode.org/ivd/" target="_blank">Ideographic Variation Database</a>. The base character for an ideographic variation sequence must be an ideographic character, and it makes use of a variation selector in the range U+E0100..U+E01EF. The term ideographic variation sequence is sometimes abbreviated as "IVS". See also <a href="https://www.unicode.org/glossary/#ideographic_variation_sequence" target="_blank">Unicode definition</a> and <a href="https://www.unicode.org/reports/tr37/" target="_blank">UTS #37</a>.</p>
Turn UTS #37 from a link into a reference, as in the following line? (The target is a technical report, so it would likely be cited as a reference.)
Fix #21.
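To make the proposed IVS definition concrete, here is a minimal Python sketch of how a search routine might recognize a base ideograph followed by a variation selector in U+E0100..U+E01EF, and optionally drop the selector before comparing text. All names (`IDEOGRAPH_RANGES`, `find_ivs`, `strip_ivs_selectors`) are hypothetical, the ideograph check is a rough approximation of the Unicode Ideographic property, and the code does not check whether a sequence is actually registered in the Ideographic Variation Database. Ignoring selectors during matching is only one possible behavior, not something this pull request specifies.

```python
# Variation selectors used by ideographic variation sequences (VS17..VS256).
IVS_SELECTOR_FIRST = 0xE0100
IVS_SELECTOR_LAST = 0xE01EF

# Assumption: a rough stand-in for "ideographic character". A real
# implementation would consult the Unicode Ideographic property instead.
IDEOGRAPH_RANGES = (
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x20000, 0x323AF),  # supplementary ideograph blocks (approximate span)
)


def is_ivs_selector(ch: str) -> bool:
    """True if ch is a variation selector in U+E0100..U+E01EF."""
    return IVS_SELECTOR_FIRST <= ord(ch) <= IVS_SELECTOR_LAST


def is_ideograph(ch: str) -> bool:
    """Approximate ideograph test based on the ranges above."""
    return any(lo <= ord(ch) <= hi for lo, hi in IDEOGRAPH_RANGES)


def find_ivs(text: str) -> list[tuple[int, str]]:
    """Return (index, sequence) for each ideograph + selector pair.

    This only checks the shape of the sequence, not IVD registration.
    """
    found = []
    for i in range(len(text) - 1):
        if is_ideograph(text[i]) and is_ivs_selector(text[i + 1]):
            found.append((i, text[i:i + 2]))
    return found


def strip_ivs_selectors(text: str) -> str:
    """Drop selectors that directly follow an ideograph, keeping the base."""
    kept = []
    for i, ch in enumerate(text):
        follows_ideograph = i > 0 and is_ideograph(text[i - 1])
        if is_ivs_selector(ch) and follows_ideograph:
            continue
        kept.append(ch)
    return "".join(kept)


# Example input: U+845B followed by VS17 (U+E0100), then U+57CE.
sample = "\u845B\U000E0100\u57CE"
sequences = find_ivs(sample)
plain = strip_ivs_selectors(sample)
```

With this sketch, a search for the plain base character would also match text that carries a selector, at the cost of losing the glyph distinction the selector encodes.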