Hi there! 👋 I'm Cristian, a Systems Engineer bridging the gap between Data Science and Full-Stack Web Development.
I specialize in building intelligent applications: designing robust backend systems with Python, crafting intuitive user interfaces with React, and deploying machine learning models to production using Docker and cloud services (GCP and Azure).
💡 What drives me: I love turning complex datasets into actionable insights and real-time interactive tools. I'm a self-motivated learner, constantly exploring new technologies to solve real-world problems.
🏆 Certifications: IBM SkillsBuild Data Analytics and Google Cloud Computing Foundations.
🌍 Check out my Portfolio | 📊 Find me on Kaggle
- Programming languages: Python, JavaScript, Java, R, SQL
- Backend: FastAPI, Django REST, Express.js, Flask
- Frontend: React + Vite, HTML, CSS
- Data: Pandas, PySpark, dbt, Apache Airflow, Matplotlib, Seaborn
- AI: Scikit-learn, PyTorch, MLlib
- Databases: PostgreSQL, MySQL, Firestore
- DevOps: Docker, Git, GitHub Actions
- Testing: Pytest, Playwright, Jest, Selenium IDE
- BI: Power BI, Microsoft Excel
- Cloud providers: Google Cloud, Microsoft Azure
- Developed a machine learning system to assist in PE diagnosis.
- Executed the entire data science pipeline: exploratory data analysis (EDA) on source data, cleaning and preprocessing training/validation data, and model tuning.
- Assessed model performance using cross-validation and an external dataset to verify that the model generalizes to unseen data.
- Analyzed model outputs using SHAP explainers alongside a lightweight LIME approach optimized for the production environment.
- Developed a web application to manage diagnoses and interact with the machine learning model in real time.
- Stack: React, FastAPI, ONNX, Firebase, Docker, Scikit-learn, GitHub Actions, Google Drive API, reCAPTCHA, SHAP, LIME, Pytest, Jest
- Repositories: Frontend, Backend, and Model Development
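
The cross-validation step in this project can be illustrated with a minimal, library-free sketch. The function name `k_fold_indices`, the sample count, and the number of folds are hypothetical choices for illustration and are not taken from the project code.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the shuffled samples as evenly as possible across k folds.
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        # Every fold except the current one is used for training.
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx


# Hypothetical usage: 10 samples split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
```

Each sample is used for validation exactly once. An external dataset, as mentioned in the list above, stays outside these folds and is scored only after tuning is finished.
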
- Trained and evaluated four neural network architectures (MLP, RNN, LSTM, GRU) and Transformers for a multiclass classification task.
- Processed and analyzed over 1.2 million Amazon reviews using NLP techniques and different word representations (TF-IDF and embeddings).
- Compared the performance of TF-IDF, self-trained embeddings, and embeddings pre-trained on the Spanish Billion Words Corpus.
- Conducted a comprehensive performance assessment to select the most accurate architecture for review classification.
- Stack: PyTorch, Pandas, Transformers, NLTK, FastText, gensim
- Repositories: Models
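
As a rough illustration of the TF-IDF representation compared in this project, here is a minimal pure-Python sketch. The `tf_idf` helper and the toy reviews are hypothetical, and the weighting shown (relative term frequency times the natural log of inverse document frequency) is one common variant, not necessarily the one used in the repository.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency scaled by inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized.
reviews = [
    "great product great price".split(),
    "bad product".split(),
    "great service".split(),
]
vectors = tf_idf(reviews)
```

Terms that appear in fewer documents receive a higher weight, so in the second review "bad" outweighs "product".
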
- Developed a web application for creating and customizing English learning activities tailored to the user’s proficiency level and preferred topics.
- Implemented interactive features such as dictation, pronunciation practice, guided conversations, and dynamic quizzes.
- Integrated AI services for automated content generation and evaluation, including pronunciation analysis and educational content creation.
- Implemented user authentication with Firebase Auth, data management with Firestore, and integrations with the Gemini and Microsoft Azure Speech APIs.
- Stack: React, Firebase, Microsoft Azure Cognitive Services, Gemini API
- Repositories: Application
- Designed and executed functional tests with Playwright (Python) for the invoicing, ticketing, and MRP modules, applying black-box techniques such as equivalence partitioning and decision tables.
- Developed unit tests with PHPUnit for critical system classes, applying statement, branch, and condition coverage criteria to achieve over 90% code coverage.
- Evaluated non-functional quality attributes, focusing on accessibility and usability via WCAG 2.2-based heuristics validated with Google Lighthouse.
- Performed load and performance testing using Apache JMeter within a Dockerized environment.
- Stack: Playwright, PHPUnit, Apache JMeter, Google Lighthouse
- Repositories: Tests
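
Equivalence partitioning, one of the black-box techniques listed above, can be sketched as follows. The rule under test (`classify_quantity`) and its thresholds are hypothetical stand-ins and do not come from the tested system. The idea is to pick one representative input per class of equivalent behavior, plus the values at each boundary.

```python
def classify_quantity(quantity):
    """Hypothetical rule under test: classify an order-line quantity."""
    if quantity < 1:
        return "invalid"
    if quantity <= 100:
        return "standard"
    return "bulk"


# One representative value per equivalence class, plus the boundary values.
CASES = [
    (-5, "invalid"),
    (0, "invalid"),
    (1, "standard"),
    (50, "standard"),
    (100, "standard"),
    (101, "bulk"),
    (500, "bulk"),
]


def run_cases():
    """Return the inputs whose actual result differs from the expected one."""
    return [q for q, expected in CASES if classify_quantity(q) != expected]


failures = run_cases()
```

An empty `failures` list means every partition behaved as expected. In a browser test, the same table of cases would drive form inputs instead of a direct function call.
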
- Built an end-to-end ETL pipeline for the "Adventure Works" database, applying data cleaning, structural normalization, and transformation logic.
- Integrated Azure Cognitive Services (Translation API) for multilingual data processing.
- Designed and published interactive reports in Power BI to track and visualize strategic business KPIs.
- Stack: SQL Server, PostgreSQL, Pandas, Microsoft Azure Cognitive Services, Power BI
- Repositories: ETL
- Conducted EDA on the Vinho Verde dataset, verifying variable distributions, identifying outliers, and analyzing attribute correlations.
- Trained and evaluated multiclass classification models, including KNN, SVM, Random Forest, and neural networks.
- Applied feature engineering methods and oversampling techniques to resolve class imbalance and optimize classifier performance.
- Stack: Python, Pandas, Scikit-learn, Matplotlib
- Repositories: Models
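
The oversampling idea used against class imbalance can be shown with a minimal random-oversampling sketch. The `random_oversample` helper, the single feature, and the class labels are hypothetical, and the project may have used a different oversampling method.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical imbalanced data: one feature per sample, three quality classes.
X = [[7.4], [7.8], [6.9], [7.1], [8.0], [6.5]]
y = ["medium", "medium", "medium", "medium", "high", "low"]
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

Oversampling should be applied only to the training split, so that duplicated samples never leak into validation or test data.
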



