- support Intel x86 and NVIDIA GPU
- add more op support for NLP models, like LSTM and BERT
- add more demos for Android/iOS/Linux/macOS/Windows
- add constant folding in runtime mode
- add GitHub Actions release for continuous delivery
- optimize kernels for ARM (armv8.2-fp16) and OpenCL (autotune)
- fix some bugs in compilation, tools, and runtime execution