llm-dataset-converter release
Version 0.2.7 of our llm_dataset_converter library has been released. New releases of ldc_doc, ldc_docx, ldc_faster_whisper, ldc_google, ldc_openai, ldc_pdf and ldc_tint have been made available as well.
The meta-library that combines all the libraries now stands at version 0.0.6.
A new Docker image is available as well:
https://hub.docker.com/r/waikatodatamining/llm-dataset-converter/tags
This is mostly a maintenance release, but it still includes some useful additions:
added set-placeholder filter for dynamically setting (temporary) placeholders at runtime
added remove-strings filter for removing substrings
added strip-strings filter for stripping whitespace from the start/end of strings
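
To give a rough idea of what the three new filters do, here is a minimal Python sketch of the underlying string operations. It is not the library's code or API: the function names (remove_strings, strip_strings, expand_placeholders), the {NAME} placeholder syntax and the example values are all assumptions made up for illustration.

```python
from typing import Dict, Iterable


def remove_strings(text: str, substrings: Iterable[str]) -> str:
    """Remove every occurrence of each substring (hypothetical helper)."""
    for sub in substrings:
        text = text.replace(sub, "")
    return text


def strip_strings(text: str) -> str:
    """Strip whitespace from the start and end of a string (hypothetical helper)."""
    return text.strip()


def expand_placeholders(text: str, placeholders: Dict[str, str]) -> str:
    """Replace {NAME} with its value; the {NAME} syntax is assumed here."""
    for name, value in placeholders.items():
        text = text.replace("{" + name + "}", value)
    return text


# A placeholder that is only known at runtime (name and path are made up).
placeholders = {}
placeholders["OUT"] = "/tmp/run1"
print(expand_placeholders("{OUT}/data.jsonl", placeholders))  # /tmp/run1/data.jsonl

# Removing substrings first and stripping afterwards cleans up a text field.
print(strip_strings(remove_strings("  <b>hello</b> world  ", ["<b>", "</b>"])))  # hello world
```

In the library itself these operations are used as filters within a conversion pipeline; please check the help output of each filter for its actual options.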