| title: README | |
| emoji: π | |
| colorFrom: pink | |
| colorTo: gray | |
| sdk: static | |
| pinned: false | |
| # BigLAM: Machine Learning for Libraries, Archives, and Museums | |
| **BigLAM** is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**. | |
| We aim to: | |
| - Share machine-learning-ready datasets from LAMs via the [Hugging Face Hub](https://huggingface.co/biglam) | |
| - 🤖 Train and release open-source models for LAM-relevant tasks | |
| - 🛠️ Develop tools and approaches tailored to LAM use cases | |
| --- | |
| <details> | |
| <summary><strong>✨ Background</strong></summary> | |
| BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, a large-scale, open NLP collaboration. | |
| Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data. | |
| </details> | |
| <details> | |
| <summary><strong>What You'll Find</strong></summary> | |
| The [BigLAM organization](https://huggingface.co/biglam) hosts: | |
| - **Datasets**: image, text, and tabular data from and about libraries, archives, and museums | |
| - **Models**: fine-tuned for tasks like: | |
|   - Art/historical image classification | |
|   - Document layout analysis and OCR | |
|   - Metadata quality assessment | |
|   - Named entity recognition in heritage texts | |
| - **Spaces**: tools for interactive exploration and demonstration | |
| </details> | |
| <details> | |
| <summary><strong>🧩 Get Involved</strong></summary> | |
| We welcome contributions! You can: | |
| - Use our [datasets and models](https://huggingface.co/biglam) | |
| - Join the discussion on [GitHub](https://github.com/bigscience-workshop/lam/discussions) | |
| - Contribute your own tools or data | |
| - Share your work using BigLAM resources | |
| </details> | |
| ## Why It Matters | |
| Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by: | |
| - Supporting inclusive and responsible AI | |
| - Helping institutions experiment with ML for access, discovery, and preservation | |
| - Ensuring that ML systems reflect diverse human knowledge and expression | |
| - Developing tools and methods that work well with the unique formats, values, and needs of LAMs | |