3 changes: 3 additions & 0 deletions duui-Climate/.dockerignore
@@ -0,0 +1,3 @@
.idea/
target/
venv/
3 changes: 3 additions & 0 deletions duui-Climate/.gitignore
@@ -0,0 +1,3 @@
.idea/
target/
venv*/
90 changes: 90 additions & 0 deletions duui-Climate/Readme.md
@@ -0,0 +1,90 @@
[![Version](https://img.shields.io/static/v1?label=duui-climate&message=0.1.0&color=blue)](https://docker.texttechnologylab.org/v2/duui-transformers-topic/tags/list)
[![Version](https://img.shields.io/static/v1?label=Python&message=3.12&color=green)]()
[![Version](https://img.shields.io/static/v1?label=Transformers&message=5.9.0&color=yellow)]()
[![Version](https://img.shields.io/static/v1?label=Torch&message=2.11.0&color=red)]()

# Transformers Climate

DUUI implementation for selected Hugging Face transformer models for climate-related text classification (see the [ClimateBERT models](https://huggingface.co/models?sort=trending&search=climatebert)).

## Included Models

| Name | Source | Revision | Languages |
|------|--------|----------|-----------|
| distilroberta-base-climate-sentiment | https://huggingface.co/climatebert/distilroberta-base-climate-sentiment | e9f9a94ee4263f5ad5cfc97b8539a497fc88aa7d | EN |
| distilroberta-base-climate-tcfd | https://huggingface.co/climatebert/distilroberta-base-climate-tcfd | 970630beedc21db81a84156448ad2e3ac860153d | EN |
| distilroberta-base-climate-commitment | https://huggingface.co/climatebert/distilroberta-base-climate-commitment | 17337c3292df16a8fe93b1505dfe4122d50a4c91 | EN |
| distilroberta-base-climate-specificity | https://huggingface.co/climatebert/distilroberta-base-climate-specificity | 4ada96ed4bf5c3a7a711282e41f1ab9b29f0ddea | EN |

# How To Use

To use duui-climate as a DUUI image, you need the [Docker Unified UIMA Interface (DUUI)](https://github.com/texttechnologylab/DockerUnifiedUIMAInterface).

## Start Docker container

```
docker run --rm -p 9714:9714 docker.texttechnologylab.org/duui-climate-[modelname]:latest
```

Find all available image tags here: [https://docker.texttechnologylab.org/v2/duui-climate-[modelname]/tags/list](https://docker.texttechnologylab.org/v2/duui-climate-[modelname]/tags/list)
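
The `[modelname]` placeholder can be filled in with a small shell helper. This is a minimal sketch, assuming the naming scheme used in this README and in `docker_build.sh` (registry, `duui-climate`, model name, tag). The function names `image_ref` and `tags_url` are hypothetical and not part of DUUI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of DUUI): build the image reference and the
# registry tag-list URL for one model, following the naming in this README.

REGISTRY="docker.texttechnologylab.org"
ANNOTATOR_NAME="duui-climate"

# image_ref MODEL [TAG] -> REGISTRY/duui-climate-MODEL:TAG (TAG defaults to "latest")
image_ref() {
  local model="$1" tag="${2:-latest}"
  printf '%s/%s-%s:%s\n' "$REGISTRY" "$ANNOTATOR_NAME" "$model" "$tag"
}

# tags_url MODEL -> URL of the registry's tag list for that model's image
tags_url() {
  local model="$1"
  printf 'https://%s/v2/%s-%s/tags/list\n' "$REGISTRY" "$ANNOTATOR_NAME" "$model"
}

# Print both values for one of the models listed above.
image_ref "distilroberta-base-climate-specificity"
tags_url "distilroberta-base-climate-specificity"
```

With these helpers, the container could be started with `docker run --rm -p 9714:9714 "$(image_ref distilroberta-base-climate-specificity)"`, and the tag list could be fetched by passing the output of `tags_url` to a tool such as `curl`.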

## Run within DUUI

```
composer.add(
    new DUUIDockerDriver.Component("docker.texttechnologylab.org/duui-climate-[modelname]:latest")
        .withParameter("selection", "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence")
);
```

### Parameters

| Name | Description |
| ---- | ----------- |
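| `selection` | Use `text` to process the full document text, or the fully qualified class name of a UIMA annotation type (e.g. `de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence`) to process each annotation of that type |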

# Cite

If you use this DUUI image, please cite it as follows:

Alexander Leonhardt, Giuseppe Abrami, Daniel Baumartz and Alexander Mehler. (2023). "Unlocking the Heterogeneous Landscape of Big Data NLP with DUUI." Findings of the Association for Computational Linguistics: EMNLP 2023, 385–399. [[LINK](https://aclanthology.org/2023.findings-emnlp.29)] [[PDF](https://aclanthology.org/2023.findings-emnlp.29.pdf)]

## BibTeX

```
@inproceedings{Leonhardt:et:al:2023,
title = {Unlocking the Heterogeneous Landscape of Big Data {NLP} with {DUUI}},
author = {Leonhardt, Alexander and Abrami, Giuseppe and Baumartz, Daniel and Mehler, Alexander},
editor = {Bouamor, Houda and Pino, Juan and Bali, Kalika},
booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2023},
year = {2023},
address = {Singapore},
publisher = {Association for Computational Linguistics},
url = {https://aclanthology.org/2023.findings-emnlp.29},
pages = {385--399},
pdf = {https://aclanthology.org/2023.findings-emnlp.29.pdf},
abstract = {Automatic analysis of large corpora is a complex task, especially
in terms of time efficiency. This complexity is increased by the
fact that flexible, extensible text analysis requires the continuous
integration of ever new tools. Since there are no adequate frameworks
for these purposes in the field of NLP, and especially in the
context of UIMA, that are not outdated or unusable for security
reasons, we present a new approach to address the latter task:
Docker Unified UIMA Interface (DUUI), a scalable, flexible, lightweight,
and feature-rich framework for automatic distributed analysis
of text corpora that leverages Big Data experience and virtualization
with Docker. We evaluate DUUI{'}s communication approach against
a state-of-the-art approach and demonstrate its outstanding behavior
in terms of time efficiency, enabling the analysis of big text
data.}
}

@misc{Bagci:2026,
author = {Bagci, Mevlüt},
title = {Hugging-Face-based climate models as {DUUI} component},
year = {2026},
howpublished = {https://github.com/texttechnologylab/duui-uima/tree/main/duui-Climate}
}

```
70 changes: 70 additions & 0 deletions duui-Climate/docker_build.sh
@@ -0,0 +1,70 @@
#!/usr/bin/env bash
set -euo pipefail

export ANNOTATOR_CUDA=
#export ANNOTATOR_CUDA="-cuda"

export ANNOTATOR_NAME=duui-climate
export ANNOTATOR_VERSION=0.1.0
export LOG_LEVEL=DEBUG
export MODEL_CACHE_SIZE=3
export DOCKER_REGISTRY="docker.texttechnologylab.org/"

###---------------------------------------------------------------------
#export MODEL_NAME="climatebert/distilroberta-base-climate-detector"
#export MODEL_SPECNAME="distilroberta-base-climate-detector"
#export MODEL_VERSION="2c3bc660d45a59e31b35f5d3e365ee4f59fdf76c"
#export MODEL_SOURCE="https://huggingface.co/climatebert/distilroberta-base-climate-detector"
#export MODEL_LANG="EN"
###--------------------------------------------------------------------

###---------------------------------------------------------------------
#export MODEL_NAME="climatebert/distilroberta-base-climate-tcfd"
#export MODEL_SPECNAME="distilroberta-base-climate-tcfd"
#export MODEL_VERSION="970630beedc21db81a84156448ad2e3ac860153d"
#export MODEL_SOURCE="https://huggingface.co/climatebert/distilroberta-base-climate-tcfd"
#export MODEL_LANG="EN"
###--------------------------------------------------------------------

###---------------------------------------------------------------------
#export MODEL_NAME="climatebert/distilroberta-base-climate-commitment"
#export MODEL_SPECNAME="distilroberta-base-climate-commitment"
#export MODEL_VERSION="17337c3292df16a8fe93b1505dfe4122d50a4c91"
#export MODEL_SOURCE="https://huggingface.co/climatebert/distilroberta-base-climate-commitment"
#export MODEL_LANG="EN"
###--------------------------------------------------------------------

###---------------------------------------------------------------------
#export MODEL_NAME="climatebert/distilroberta-base-climate-sentiment"
#export MODEL_SPECNAME="distilroberta-base-climate-sentiment"
#export MODEL_VERSION="e9f9a94ee4263f5ad5cfc97b8539a497fc88aa7d"
#export MODEL_SOURCE="https://huggingface.co/climatebert/distilroberta-base-climate-sentiment"
#export MODEL_LANG="EN"
###--------------------------------------------------------------------

##---------------------------------------------------------------------
export MODEL_NAME="climatebert/distilroberta-base-climate-specificity"
export MODEL_SPECNAME="distilroberta-base-climate-specificity"
export MODEL_VERSION="4ada96ed4bf5c3a7a711282e41f1ab9b29f0ddea"
export MODEL_SOURCE="https://huggingface.co/climatebert/distilroberta-base-climate-specificity"
export MODEL_LANG="EN"
##--------------------------------------------------------------------



# Build the image for the selected model; the exported variables above are
# passed through as build arguments.
docker build \
  --build-arg ANNOTATOR_NAME \
  --build-arg ANNOTATOR_VERSION \
  --build-arg LOG_LEVEL \
  --build-arg MODEL_CACHE_SIZE \
  --build-arg MODEL_NAME \
  --build-arg MODEL_VERSION \
  --build-arg MODEL_SOURCE \
  --build-arg MODEL_LANG \
  -t "${DOCKER_REGISTRY}${ANNOTATOR_NAME}-${MODEL_SPECNAME}:${ANNOTATOR_VERSION}${ANNOTATOR_CUDA}" \
  -f "src/main/docker/Dockerfile${ANNOTATOR_CUDA}" \
  .

# Also tag the versioned image as "latest" (with the CUDA suffix, if set).
docker tag \
  "${DOCKER_REGISTRY}${ANNOTATOR_NAME}-${MODEL_SPECNAME}:${ANNOTATOR_VERSION}${ANNOTATOR_CUDA}" \
  "${DOCKER_REGISTRY}${ANNOTATOR_NAME}-${MODEL_SPECNAME}:latest${ANNOTATOR_CUDA}"
157 changes: 157 additions & 0 deletions duui-Climate/pom.xml
@@ -0,0 +1,157 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>org.texttechnologylab.duui</groupId>
<artifactId>duui-climate</artifactId>
<version>0.1.0</version>

<licenses>
<license>
<name>AGPL-3.0-or-later</name>
<url>https://www.gnu.org/licenses/agpl.txt</url>
<distribution>repo</distribution>
<comments>GNU Affero General Public License v3.0 or later</comments>
</license>
</licenses>

<organization>
<name>Texttechnology Lab</name>
<url>https://www.texttechnologylab.org</url>
</organization>
<developers>
<developer>
<id>mehler</id>
<name>Prof. Dr. Alexander Mehler</name>
<email>mehler@em.uni-frankfurt.de</email>
<url>https://www.texttechnologylab.org/team/alexander-mehler/</url>
<organization>Goethe University Frankfurt / Texttechnology Lab</organization>
<organizationUrl>https://www.texttechnologylab.org</organizationUrl>
<roles>
<role>head of department</role>
</roles>
</developer>
<developer>
<id>bagci</id>
<name>Mevlüt Bagci</name>
<email>bagci@em.uni-frankfurt.de</email>
<url>https://www.texttechnologylab.org/team/mevl%c3%bct-bagci/</url>
<organization>Goethe University Frankfurt / Texttechnology Lab</organization>
<organizationUrl>https://www.texttechnologylab.org</organizationUrl>
<roles>
<role>lead developer</role>
</roles>
<timezone>Europe/Berlin</timezone>
</developer>
</developers>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
<configuration>
<argLine>
--add-opens java.base/java.util=ALL-UNNAMED
<!-- add-opens for use in JUnit-Tests...-->
</argLine>
</configuration>
</plugin>
</plugins>
</build>


<properties>
<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>
<dkpro.core.version>2.4.0</dkpro.core.version>
<!-- <ttlab.duui.version>f68ca579ab553074f76d061623dc9b00cf508276</ttlab.duui.version>-->
<!-- <ttlab.typesystem.version>033beaa593a99c005400f4021ea8d6fa8957e6c3</ttlab.typesystem.version>-->
</properties>

<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>

<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.dkpro.core</groupId>
<artifactId>dkpro-core-asl</artifactId>
<version>${dkpro.core.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>

<dependencies>
<!--<dependency>
<groupId>com.github.texttechnologylab</groupId>
<artifactId>DockerUnifiedUIMAInterface</artifactId>
<version>${ttlab.duui.version}</version>
</dependency>-->
<dependency>
<groupId>com.github.mevbagci</groupId>
<artifactId>DockerUnifiedUIMAInterface</artifactId>
<!-- <version>7cef2433b5</version>-->
<!-- <version>30573d4a01</version>-->
<version>ad501be374</version>
</dependency>
<!-- <dependency>-->
<!-- <groupId>com.github.texttechnologylab.textimager-uima</groupId>-->
<!-- <artifactId>textimager-uima-util</artifactId>-->
<!-- <version>${ttlab.textimager.typesystem.version}</version>-->
<!-- </dependency>-->

<dependency>
<groupId>com.github.mevbagci</groupId>
<artifactId>UIMATypeSystem</artifactId>
<version>3.0.23.1</version>
</dependency>

<!-- <dependency>-->
<!-- <groupId>org.texttechnologylab.annotation</groupId>-->
<!-- <artifactId>typesystem</artifactId>-->
<!-- <version>3.0.1</version>-->
<!-- </dependency>-->

<!-- <dependency>-->
<!-- <groupId>org.texttechnologylab</groupId>-->
<!-- <artifactId>DockerUnifiedUIMAInterface</artifactId>-->
<!-- <version>1.3</version>-->
<!-- </dependency>-->

<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>5.9.0</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.dkpro.core</groupId>
<artifactId>dkpro-core-api-segmentation-asl</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.dkpro.core</groupId>
<artifactId>dkpro-core-io-xmi-asl</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.dkpro.core</groupId>
<artifactId>dkpro-core-api-resources-asl</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
14 changes: 14 additions & 0 deletions duui-Climate/requirements.txt
@@ -0,0 +1,14 @@
torch==2.11.0
torchaudio==2.11.0
torchvision==0.26.0
scipy==1.17.1
transformers==5.9.0
sentencepiece==0.2.1
protobuf==4.25.3
numpy==2.4.6
scikit-learn==1.8.0
fastapi==0.110.0
dkpro-cassis==0.9.1
uvicorn[standard]==0.27.1
pydantic-settings==2.0.2
torchmetrics==1.2.0