20 Simple Python Packages to Improve Your Data Workflow
Use these packages to accelerate your job
Many Python packages have been developed to help data professionals with their work. In my experience, many valuable ones are still under-recognized or only starting to gain popularity.
In this article, I want to introduce you to several Python packages that can help your data workflow. Let’s get into it!
1. Knockknock
Knockknock is a simple Python package that notifies you when your machine learning model training has finished or crashed. We can receive the notification through many channels, such as email, Slack, and Microsoft Teams.
To install the package, we use the following command.
pip install knockknock
For example, we could use the following code to email ourselves the status of a model training run, using a Gmail address as the sender.
from knockknock import email_sender
from sklearn.linear_model import LinearRegression
import numpy as np

@email_sender(recipient_emails=["<your_email@address.com>", "<your_second_email@address.com>"], sender_email="<sender_email@gmail.com>")
def train_linear_model(your_nicest_parameters):
    x = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(x, np.array([1, 2])) + 3
    regression = LinearRegression().fit(x, y)
    return regression.score(x, y)
You would get a notification once the decorated function finishes or crashes, and whatever the function returns is reported in the message.
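To make the idea behind this kind of decorator concrete, here is a minimal sketch using only the standard library. It is not Knockknock's actual implementation: the names notify_on_finish, send, and train are hypothetical, and a plain list stands in for a real channel such as email or Slack.

```python
import functools
import traceback


def notify_on_finish(send):
    """Hypothetical decorator: call `send(message)` when the wrapped function ends."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Report the crash, then re-raise so the caller still sees the error.
                send(f"{func.__name__} crashed:\n{traceback.format_exc()}")
                raise
            # Report success together with the returned value.
            send(f"{func.__name__} finished, returned: {result!r}")
            return result
        return wrapper
    return decorator


messages = []  # stand-in for an email or Slack channel (assumption for this sketch)


@notify_on_finish(messages.append)
def train(fail=False):
    if fail:
        raise ValueError("training diverged")
    return 0.97


score = train()
print(messages[-1])
```

Calling train(fail=True) would append a crash message containing the traceback and then raise the original error, which mirrors the "finished or crashed" behavior described above.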
2. Maya
Maya is a Python package that makes parsing datetime data as effortless as possible. It offers a simple, human-readable interface for getting the datetime values we want. Let’s start using the package by installing it first.
pip install maya
Then, we can use the following code to access the current date easily.
import maya
now = maya.now()
print(now)
We could also create an object that represents tomorrow's date.
tomorrow = maya.when('tomorrow')
tomorrow.datetime()
The package is helpful for any time-series-related work, so try it out.
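For comparison, here is a rough standard-library equivalent of the two Maya snippets above. It is only a sketch of what those calls resolve to, assuming Maya works in UTC and that 'tomorrow' means the same time one day later; it is not how Maya is implemented.

```python
from datetime import datetime, timedelta, timezone

# Roughly what maya.now() represents: the current moment, timezone-aware in UTC.
now = datetime.now(timezone.utc)

# Roughly what maya.when('tomorrow') resolves to: the same time one day later.
tomorrow = now + timedelta(days=1)

print(now.isoformat())
print(tomorrow.isoformat())
```

The Maya version reads closer to plain English, which is the main reason to reach for it over manual timedelta arithmetic.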
3. category_encoders
category_encoders is a Python package for encoding categorical data (transforming it into numerical data). The package is a collection of encoding methods that we can apply to categorical features, depending on what we need.
To try out the package, we first need to install it.
pip install category_encoders
Then, we could apply the transformation using the following example.
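from category_encoders import BinaryEncoder
import pandas as pd
# small sample dataset, assumed here for illustration (the original df is not shown)
df = pd.DataFrame({'origin': ['usa', 'japan', 'europe', 'usa']})
# use binary encoding to encode the 'origin' categorical feature
enc = BinaryEncoder(cols=['origin']).fit(df)
# transform the dataset
numeric_dataset = enc.transform(df)
numeric_dataset.head()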
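If binary encoding is new to you, the idea can be sketched by hand in a few lines of plain Python. This is an illustration of the technique only: the binary_encode helper and the sample origin values are hypothetical, and the exact column names and ordering produced by category_encoders may differ.

```python
def binary_encode(values):
    """Hand-rolled binary encoding of a list of categories (illustrative only)."""
    # Step 1: give each distinct category an integer code, in order of first appearance.
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes) + 1)
    # Step 2: work out how many binary digits the largest code needs.
    width = max(codes.values()).bit_length()
    # Step 3: write every code as a fixed-width list of 0/1 digits.
    return [[int(bit) for bit in format(codes[v], f"0{width}b")] for v in values]


origin = ["usa", "japan", "europe", "usa"]  # assumed sample values
encoded = binary_encode(origin)
for category, bits in zip(origin, encoded):
    print(category, bits)
```

Three categories fit into two binary columns here, whereas one-hot encoding would need three. That saving grows quickly as the number of categories increases.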