Add example for documentation #106

Stargateur · 2017-06-06T20:43:55Z

I didn't find any example.

Would it be possible to add an examples folder with all the examples needed to print, iterate over, create, and read a UTF-8 string with this library?

stevengj · 2017-06-06T20:49:15Z

Examples would be welcome, but it sounds like they should go into the README or into the manual?

stevengj · 2017-06-06T20:51:37Z

Note that utf8proc does not handle printing of UTF-8 strings. To print a UTF-8 string you can just use printf (caveat: on Windows, you need to set the terminal to the UTF-8 code page). UTF-8 strings can be created in any decent text editor (since most text editors can be set to edit in UTF-8 mode). And reading a UTF-8 string is also something that you can do with standard C library functions. UTF-8 strings are just bytes read from, e.g., a file in the UTF-8 encoding.
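A minimal sketch of that point in plain C, with no utf8proc calls. The helper names `print_utf8_line` and `count_code_points` are made up for this illustration, and the Windows branch assumes that calling `SetConsoleOutputCP(CP_UTF8)` is an acceptable way to switch the console code page for your program.

```c
#include <stdio.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: print a UTF-8 string followed by a newline.
   Returns a non-negative value on success and EOF on failure, like puts(). */
int print_utf8_line(const char *s) {
#ifdef _WIN32
  SetConsoleOutputCP(CP_UTF8); /* switch the console to the UTF-8 code page */
#endif
  return puts(s); /* the bytes are written unchanged */
}

/* Hypothetical helper: count the code points in a valid UTF-8 string by
   counting every byte that is not a continuation byte (10xxxxxx). */
size_t count_code_points(const char *s) {
  size_t n = 0;
  for (; *s != '\0'; s++) {
    if (((unsigned char)*s & 0xC0) != 0x80)
      n++;
  }
  return n;
}
```

For the two-character string `"\xe4\xb8\xad\xe6\x96\x87"`, `strlen` reports 6 bytes while `count_code_points` reports 2, which is the sense in which a UTF-8 string is "just bytes".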

stevengj · 2017-06-06T20:54:42Z

The main purpose of this library is for things like Unicode normalization, case-folding, and so on, that require Unicode data tables. There are also functions to encode/decode Unicode codepoints to/from UTF-8, as described in the manual. Maybe that is what you mean by "creating" and "reading" UTF-8 strings?

Stargateur · 2017-06-06T21:23:34Z

> Examples would be welcome, but it sounds like they should go into the README or into the manual?

That would be perfect too.

> Note that utf8proc does not handle printing of UTF-8 strings. To print a UTF-8 string you can just use printf (caveat: on Windows, you need to set the terminal to the UTF-8 code page). UTF-8 strings can be created in any decent text editor (since most text editors can be set to edit in UTF-8 mode). And reading a UTF-8 string is also something that you can do with standard C library functions. UTF-8 strings are just bytes read from, e.g., a file in the UTF-8 encoding.

I know, but this could really help beginners understand the basic use of the library. As you said, for example, users have to read and write strings themselves.

My issue comes from a question on Stack Overflow, this one. I was unable to provide an answer because I didn't understand how to use this library.

I tried this, but I'm sure it's not the right way to do it:

#include <stdio.h>
#include <utf8proc.h>
#include <unistd.h>

int main(void) {
  utf8proc_uint8_t const string[6] = "\xe4\xb8\xad\xe6\x96\x87"; // or this u8"ايه الاخبار"
  utf8proc_ssize_t size = sizeof string / sizeof *string;
  utf8proc_int32_t data;
  utf8proc_ssize_t n;

  utf8proc_uint8_t const *pstring = string;
  while ((n = utf8proc_iterate(pstring, size, &data)) > 0) {
    printf("%.*s\n", (int)n, pstring);
    pstring += n;
    size -= n;
  }
}

cesss · 2018-01-07T11:04:00Z

@Stargateur: First of all, your code has an important error that will make it fail no matter which library you use for UTF-8: you statically allocate 6 bytes for a string made of 6 bytes. That's not correct. Strings in C are null-terminated: they need a zero byte at the end. So you need to allocate 7 bytes for a string that has 6 bytes of data. For static allocation, the compiler can do this for you automatically if you leave the array length between the brackets empty. Read the chapter on strings in any good C book and you'll learn all of this.

Second, you don't need utf8proc for declaring a UTF-8 string and printing it.

In your case, your code could be reduced to just two lines:

const char string[]="ايه الاخبار"; /* no need to prepend "u8" if the file is encoded as UTF-8 with no BOM */
printf("%s\n",string);

As simple as that.
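The sizing point can be checked directly. This is a small sketch in plain C; the array name `auto_sized` and both helper functions are hypothetical. Note that C itself accepts an array of exactly 6 elements initialized from a 6-byte literal, but such an array holds no terminator, so it is only safe with an explicit length and never with `strlen` or a bare `%s`.

```c
#include <stddef.h>
#include <string.h>

/* With empty brackets the compiler sizes the array itself:
   six data bytes plus the terminating zero byte. */
static const char auto_sized[] = "\xe4\xb8\xad\xe6\x96\x87";

/* Hypothetical helper: bytes of storage, terminator included. */
size_t auto_sized_storage(void) {
  return sizeof auto_sized;
}

/* Hypothetical helper: bytes of string data, terminator excluded. */
size_t auto_sized_length(void) {
  return strlen(auto_sized);
}
```

Here `auto_sized_storage()` is 7 and `auto_sized_length()` is 6, which matches the "7 bytes for 6 bytes of data" rule above.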

giampaolo · 2018-01-25T20:07:29Z

I agree a directory of examples would be great to have!
https://julialang.org/utf8proc/doc/ is an API reference, which is very different from documentation, a tutorial, or an "example usage" section in the README, which is probably the most immediate way to get something working ASAP.

niblo · 2021-01-03T09:01:44Z

(I was about to create a new issue, but this one seems to be a good fit.)

I'm also looking for examples.

What I'm trying to do is implement an iterator function that iterates over graphemes (in C). I'm implementing it as a patch to utf8proc to piggyback on the test infrastructure, but I'm seeing some odd results.

One reference that I haven't looked at yet is the graphemes function in Julia, but I don't know the Julia language.

Do you know of any other references, or perhaps an existing implementation?

jamesBrosnahan · 2021-07-18T21:34:14Z

> I agree a directory of examples would be great to have!
> https://julialang.org/utf8proc/doc/ is an API reference, which is very different from documentation, a tutorial, or an "example usage" section in the README, which is probably the most immediate way to get something working ASAP.

The link 404s.

wolfield · 2022-07-14T13:17:15Z

Bump to this.

stevengj added the documentation label Mar 27, 2020
