A TYPO3 CMS extension that provides Apache Tika functionality, including:
- text extraction
- meta data extraction
- language detection (from strings or files)
Tika can be used as a standalone Tika app/jar, as a Tika server, or via SolrCell integrated in Apache Solr.
We're open to contributions!
Please find further information regarding Apache Tika on the project's homepage: https://tika.apache.org/
We use GitHub Actions for continuous integration.
To run the test suite locally, please use our DDEV Docker environment: https://github.com/TYPO3-Solr/solr-ddev-site
Note: This requires a proper combination of branches:
- solr-ddev-site on release-12.0.x branch
- packages/ext-solr on release-12.0.x
- packages/ext-tika on release-12.0.x
- Please refer to the version matrix for the proper combination of branches
ddev enable tika
ddev tests-unit-tika
ddev tests-integration-tika
- Fork the repository
- Clone the repository
- Create a new branch
- Make your changes
- Commit your changes to your fork. In your commit message refer to the issue number if there is already one, e.g.
[BUGFIX] short description of fix (resolves #4711)
- Submit a Pull Request (here are some hints on How to write the perfect pull request)
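The steps above can be sketched in shell roughly as follows. This is only an illustration: the helper `commit_message`, the `<your-user>` placeholder, and the branch name `bugfix/my-fix` are assumptions made for this example and are not part of the project. The git commands are commented out so that the snippet can be sourced without side effects.

```shell
#!/usr/bin/env bash

# Hypothetical helper: builds a commit message in the
# "[TYPE] description (resolves #N)" form shown above.
# The issue number (third argument) is optional.
commit_message() {
  local kind="$1" description="$2" issue="${3:-}"
  if [ -n "$issue" ]; then
    printf '[%s] %s (resolves #%s)\n' "$kind" "$description" "$issue"
  else
    printf '[%s] %s\n' "$kind" "$description"
  fi
}

# Typical sequence after forking the repository on GitHub
# (replace <your-user> and the branch name with your own values):
#
#   git clone https://github.com/<your-user>/ext-tika.git
#   cd ext-tika
#   git checkout -b bugfix/my-fix
#   ... make your changes ...
#   git commit -am "$(commit_message BUGFIX 'short description of fix' 4711)"
#   git push origin bugfix/my-fix
#
# Then open a Pull Request from your branch on GitHub.
```

Calling `commit_message BUGFIX 'short description of fix' 4711` produces the example message shown in the list; leaving out the issue number omits the `(resolves #...)` suffix.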
To keep your fork in sync with the original repository:
git remote add upstream https://github.com/TYPO3-Solr/ext-tika.git
git fetch upstream
git checkout master
git merge upstream/master
git push origin master