Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FcPatternGetLangSet exposed #16

Merged
merged 6 commits into from
Mar 17, 2022
Merged

FcPatternGetLangSet exposed #16

merged 6 commits into from
Mar 17, 2022

Conversation

makemeunsee
Copy link
Contributor

@makemeunsee makemeunsee commented Mar 13, 2022

Hello,
I was looking for a way to filter fonts by supported languages, and FcPatternGetLangSet is just that. Hopefully this contribution is welcome, looking forward to any feedback.
Cheers.

PS: usage sample code:

use std::{
    collections::{BTreeMap, HashSet},
    ffi::CStr,
};

use fontconfig::{Fontconfig, Pattern};

pub fn japanese_font_families_and_styles(fc: &Fontconfig) -> BTreeMap<String, HashSet<String>> {
    fontconfig::list_fonts(&Pattern::new(&fc), None)
        .iter()
        .filter(|p| p.get_lang_set().unwrap().contains(&"ja"))
        .fold(
            BTreeMap::new(),
            |mut acc: BTreeMap<String, HashSet<String>>, p| {
                let family = p
                    .get_string(CStr::from_bytes_with_nul(b"family\0").unwrap())
                    .unwrap()
                    .to_owned();
                let set = acc.entry(family).or_default();
                match p.get_string(CStr::from_bytes_with_nul(b"style\0").unwrap()) {
                    Some(style) => set.insert(style.to_owned()),
                    None => false,
                };
                acc
            },
        )
}

}
let cstr = CStr::from_ptr(lang_str as *const c_char);
res.push(cstr.to_str().unwrap());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator created by FcStrListCreate is currently leaked. FcStrListDone needs to be called when done with the iterator. That can't be done in this method because it returns strings yielded by the iterator. I think the API needs to be reworked to be like FontSet:

  • Create a StrList wrapper type to hold the the FcStrList
  • Implement from_raw, iter, and drop methods on it
  • In the drop impl call FcStrListDone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointers! I'll give it a try and update the PR.

Copy link
Contributor Author

@makemeunsee makemeunsee Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing the obvious, what would be wrong with copying each lang_str and storing it as an owned String rather than providing references with little guarantee of validity or lifetime?
Client code would need to perform a copy anyway, before eventually calling FcStrListDone through LangSet.drop.
Impl would look so:

pub fn get_lang_set(& self) -> Option<Vec<String>> {
        unsafe {
            let mut ret: *mut sys::FcLangSet = ptr::null_mut();
            if ffi_dispatch!(LIB, FcPatternGetLangSet, self.pat, FC_LANG.as_ptr(), 0, &mut ret as *mut _)
                == sys::FcResultMatch
            {
                let ss: *mut sys::FcStrSet = ffi_dispatch!(LIB, FcLangSetGetLangs, ret);
                let lang_strs: *mut sys::FcStrList = ffi_dispatch!(LIB, FcStrListCreate, ss);
                let mut res = vec![];
                loop {
                    let lang_str: *mut sys::FcChar8 = ffi_dispatch!(LIB, FcStrListNext, lang_strs);
                    if lang_str.is_null() {
                        break;
                    }
                    let cstr = CStr::from_ptr(lang_str as *const c_char);
                    res.push(cstr.to_str().unwrap().clone().to_owned());
                }
                ffi_dispatch!(LIB, FcStrListDone, lang_strs);
                Some(res)
            } else {
                None
            }
        }
    }

(Strings could also be pushed conditionally to being Ok to unwrap)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be wrong with copying each lang_str and storing it as an owned String

Nothing wrong per se, it's just some allocation and copying that can be avoided.

rather than providing references with little guarantee of validity or lifetime?

With the iterator option the lifetime of the strings that the iterator outputs would be tied to the StrList wrapper so they would remain valid while the StrList remained alive.

Client code would need to perform a copy anyway

Not necessarily, as in your original example you might only want to compare the strings to a known value ( .filter(|p| p.get_lang_set().unwrap().contains(&"ja")))

Here's a rough sketch of what I'm imagining. If you don't think it's worth pursing then we can go with the vector of owned strings.

pub struct StrList<'a> {
    list: *mut FcStrList,
    _life: &'a PhantomData
}

impl Pattern {
    pub fn lang_set(&self) -> Option<StrList<'_>> {
        unsafe {
            let mut ret: *mut sys::FcLangSet = ptr::null_mut();
            if ffi_dispatch!(LIB, FcPatternGetLangSet, self.pat, FC_LANG.as_ptr(), 0, &mut ret as *mut _) == sys::FcResultMatch {
                let ss: *mut sys::FcStrSet = ffi_dispatch!(LIB, FcLangSetGetLangs, ret);
                let lang_strs: *mut sys::FcStrList = ffi_dispatch!(LIB, FcStrListCreate, ss);
                Some(StrList::from_raw(self.fc, lang_strs))
            } else {
                None
            }
        }
    }
}

impl<'a> StrList<'a> {
    unsafe fn from_raw(_: &Fontconfig, raw_list: *mut sys::FcStrSet) -> Self {
        Self { list: raw_list, _life: &PhantomData }
    }
}

impl<'a> Drop for StrList<'a> {
    fn drop(self) {
        unsafe { ffi_dispatch!(LIB, FcStrListDone, self.list) };
    }
}

impl<'a> Iterator<Output = &'a str> {
    fn next(&mut self) -> Option<&'a str> {
        let lang_str: *mut sys::FcChar8 = unsafe { ffi_dispatch!(LIB, FcStrListNext, self.list) };
        if lang_str.is_null() {
            None
        }
        else {
            Some(CStr::from_ptr(lang_str as *const c_char))
        }
    }
}

Copy link
Contributor Author

@makemeunsee makemeunsee Mar 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the constructive discussion, it's very much appreciated!

Indeed, I realized only later than my own example doesn't require owned strings at all. I updated the PR with an impl based on your sketch.

The PhantomData for the lifetime is a nice trick to know as well.

Now there's just one bit I'm not too sure about, I'll start a thread about it.

@alfiedotwtf
Copy link

LGTM

@wezm wezm merged commit f8d705c into yeslogic:master Mar 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants