Hi, welcome to my Data Science Portfolio.

This webpage serves as a showcase for my business problem-solving skills using data science concepts and tools. Here, I demonstrate my ability to tackle real-world challenges by leveraging the power of data.

Discover more about my professional experience, skills, and the tools and concepts I've honed throughout my data science journey. From data exploration and analysis to machine learning and visualization, I provide a comprehensive overview of my expertise in the field.

If you have any inquiries or would like to connect, please don't hesitate to reach out using the provided contact links at the end of this page. I look forward to connecting with you and discussing how we can collaborate in the exciting world of data science.

About me

My name is Débora Craveiro.

I currently work as a Data Scientist and Machine Learning Engineer at Xpand IT, on a project for one of the biggest Portuguese retailers.

I am a self-motivated and highly adaptable person. My background is in Mechanical Engineering. Data Science crossed paths with my academic journey, and I decided to pursue it seriously and move my career into the field.

In my previous experience as a Data Scientist at Hidromod, I worked with satellite imagery data to solve crop classification problems and to forecast crop growth using time series.

Skills

Programming Languages and Databases

Python focused on data analysis
SQL for data extraction
PySpark
Web scraping using Python
Postgres and SQLite databases
JavaScript for the Google Earth Engine platform

Statistics and Machine Learning

Descriptive Statistics (measures of center, measures of spread, skew, kurtosis)
Regression, Classification, and Clustering algorithms
Different methods for data balancing, feature selection, and dimensionality reduction
Algorithm performance metrics (RMSE, MAE, MAPE, Confusion Matrix, Precision, Recall, Silhouette Score)
Machine Learning packages: scikit-learn, SciPy, and Keras

Data Visualization

Matplotlib, Seaborn, and Plotly
Metabase (early stage)

Software Engineering

Azure DevOps, Azure Databricks, Heroku Cloud
Git, Linux, Cookiecutter, virtual environment, and Docker
Streamlit, Flask, Python APIs

Professional Experience

1+ years as Data Scientist

Starting a new project with the Customer Intelligence & Analytics department of one of the biggest Portuguese retailers.

End-to-end data science solutions for the agricultural industry using time-series data from satellite imagery: crop type classification, detection of wrongly declared agricultural classes using data analysis, crop growth forecasting, and water level monitoring in reservoirs.

3+ Data Science Projects

Data-driven business solutions, close to real market challenges, built on public data provided for Data Science competitions. I take each problem from the conception of the business problem through to publishing the trained algorithms in a production environment using cloud computing tools.

4+ years as Junior Researcher

Since the first year of my master's degree, I have been working with programming languages to solve different kinds of problems. Initially, I worked with Matlab and Maple, and then I migrated to Python. Working as a researcher, I had the opportunity to implement different types of routines, from solving complex algebraic problems to advanced optimization techniques (Particle Swarm Optimization, Simulated Annealing, Ant Colony Optimization, Conjugate Gradient, Augmented Lagrangian, etc.).

Data Science Projects

Loyalty Program using Client Clustering

I used Python, statistics, and unsupervised learning to segment clients based on their purchase characteristics. The objective is to select a group of clients to participate in a loyalty program in order to increase the company's total income.
This is an ongoing project.

Tools

Git, Linux
Python, Pandas, Matplotlib, and Seaborn
Jupyter Notebook
KMeans, Gaussian Mixture Model, Hierarchical Clustering, and DBSCAN
Metabase Visualization (early stage)

Go to project

Web scraping for Jeans Price Prediction

This project aims to price men's jeans for a new store entering the market. It involves web scraping, database manipulation, and ETL design. Data available on the H&M and Macy's websites is being used for price prediction.
This project is on standby.

Tools

Git, Linux
Python, SQLite
BeautifulSoup, Pandas

Go to project

Sales Prediction for a Drugstore Chain

This project aimed to predict the next six weeks of sales for Rossmann drugstores. A machine learning regression model was used, and the results show an approximate accuracy of 88%, a significant improvement over the established baseline. On the validation data alone, the new model improved the solution by US$2857 per store on average.
The sales prediction by store can be accessed anywhere via Telegram.
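As an illustration of how a percentage accuracy can be read off a regression forecast, here is a minimal sketch that assumes the accuracy is reported as 1 minus the mean absolute percentage error (MAPE). That reading of the figure above is an assumption, the `mape` helper is hypothetical, and the sales numbers are made up for the example rather than taken from the project.

```python
def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.12 means 12%)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Skip zero actuals: the percentage error is undefined for them.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Made-up daily sales for one store (illustration only, not project data).
actual_sales = [5000.0, 6200.0, 4800.0, 7100.0]
predicted_sales = [4500.0, 6800.0, 5300.0, 6500.0]

error = mape(actual_sales, predicted_sales)
print(f"MAPE: {error:.1%}, approximate accuracy: {1 - error:.1%}")
```

Because MAPE is relative to each actual value, it is easy to communicate to a business audience, but it is undefined when the actual value is zero, which is why this sketch skips those rows.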

Tools

Git
Python, Pandas, Numpy, and Seaborn
Anaconda, Pycharm, and Jupyter Notebook
XGBoost Regressor, Random Forest Regressor, Linear Regression, Lasso
Flask and Python API
Heroku Cloud

Go to project

Real Estate market project to identify opportunities for Reselling

This project identifies properties selling below their average price and defines their ideal resale price, based on an exploratory data analysis using Python.
(Unfinished)
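The screening idea can be sketched as follows. This is a minimal illustration, not the project code: it assumes listings are grouped by a `region` field and compared against the regional median price (a robust stand-in for the average), and the field names, the `find_opportunities` helper, and the listings themselves are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def find_opportunities(listings):
    """Return the listings priced below the median price of their region."""
    prices_by_region = defaultdict(list)
    for item in listings:
        prices_by_region[item["region"]].append(item["price"])
    # One reference price per region (assumption: the median).
    region_median = {r: median(p) for r, p in prices_by_region.items()}
    return [
        {**item, "region_median": region_median[item["region"]]}
        for item in listings
        if item["price"] < region_median[item["region"]]
    ]


# Made-up listings (illustration only, not project data).
listings = [
    {"id": 1, "region": "A", "price": 300_000},
    {"id": 2, "region": "A", "price": 400_000},
    {"id": 3, "region": "A", "price": 500_000},
    {"id": 4, "region": "B", "price": 200_000},
    {"id": 5, "region": "B", "price": 250_000},
]

opportunities = find_opportunities(listings)
for item in opportunities:
    print(item["id"], item["price"], item["region_median"])
```

A fuller analysis would also condition on property attributes such as size or condition before treating a low price as an opportunity.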

Tools

Python, Pandas, Numpy, and Seaborn
Anaconda, Pycharm, and Jupyter Notebook
Interactive Maps with Plotly and Folium
Heroku Cloud
Streamlit Python web framework

Go to project

Contacts

Feel free to get in touch with me.