Non-Brand Data

20 Simple Python Packages to Improve Your Data Workflow

Use these packages to accelerate your job

Cornellius Yudha Wijaya
Jan 17, 2024

Many Python packages have been developed to help data practitioners in their work. In my experience, many of the valuable ones lack recognition or are only now gaining popularity.

In this article, I want to introduce you to several Python packages that can help your data workflow in different ways. Let’s get into it!

1. Knockknock

Knockknock is a simple Python package that notifies you when your machine learning model training has finished or crashed. You can receive the notification through many channels, such as email, Slack, or Microsoft Teams.

To install the package, run the following command.

pip install knockknock

For example, we could use the following code to send the training status of a machine learning model to your email addresses from a Gmail sender account.

from knockknock import email_sender
from sklearn.linear_model import LinearRegression
import numpy as np

# Send a notification when the decorated function finishes or crashes
@email_sender(recipient_emails=["<your_email@address.com>", "<your_second_email@address.com>"], sender_email="<sender_email@gmail.com>")
def train_linear_model(your_nicest_parameters):
    # Toy data: y = 1*x0 + 2*x1 + 3
    x = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(x, np.array([1, 2])) + 3
    regression = LinearRegression().fit(x, y)

    # The returned score is included in the notification
    return regression.score(x, y)

The notification includes whatever value the decorated function returns, so here you would receive the model score.
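To make the idea behind this kind of decorator concrete, here is a minimal sketch of the pattern in plain Python. It is not Knockknock's actual implementation: the names notify_on_finish and send are hypothetical, and the list of messages stands in for a real channel such as email or Slack.

```python
import functools
import traceback


def notify_on_finish(send):
    """Wrap a function so `send` gets a message when it finishes or crashes.

    `send` is a hypothetical callable that takes one string; a real
    notifier would deliver that string by email, Slack, and so on.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            send(f"{func.__name__} started")
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Report the crash with its traceback, then re-raise it.
                send(f"{func.__name__} crashed:\n{traceback.format_exc()}")
                raise
            # Include the return value in the completion message.
            send(f"{func.__name__} finished, returned {result!r}")
            return result
        return wrapper
    return decorator


messages = []  # stand-in for a real notification channel


@notify_on_finish(messages.append)
def train():
    return 0.95


score = train()
```

After the call, messages holds one entry for the start and one for the completion, and the completion entry carries the returned score.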

2. Maya

Maya is a Python package for parsing datetime data with as little effort as possible. It offers a simple, human-readable interface for getting the datetime values we want. Let’s start by installing the package.

pip install maya

Then, we can use the following code to access the current date easily.

import maya
now = maya.now()
print(now)

We can also create an object for tomorrow's date from a plain-language string.

tomorrow = maya.when('tomorrow')
tomorrow.datetime()

The package is helpful for any work involving time series, so try it out.
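For comparison, here is a rough standard-library version of the two snippets above. It is only a sketch of what those calls save you from writing: it assumes that Maya reports times in UTC and that 'tomorrow' means the current time plus one day, which is my understanding rather than something guaranteed here.

```python
from datetime import datetime, timedelta, timezone

# Current moment as a timezone-aware UTC datetime
now = datetime.now(timezone.utc)

# "Tomorrow" taken as exactly one day after now (an assumption)
tomorrow = now + timedelta(days=1)

print(now.isoformat())
print(tomorrow.isoformat())
```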

3. category_encoders

category_encoders is a Python package for encoding categorical data, that is, transforming it into numerical data. The package collects a variety of encoding methods that we can apply to categorical features, depending on what we need.

To try it out, we first need to install the package.

pip install category_encoders

Then, we could apply the transformation using the following example.

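import pandas as pd
from category_encoders import BinaryEncoder

# Illustrative stand-in data; any DataFrame with a categorical 'origin' column works
df = pd.DataFrame({'origin': ['usa', 'japan', 'europe', 'usa']})

# Fit a binary encoder on the categorical 'origin' column
enc = BinaryEncoder(cols=['origin']).fit(df)

# Transform the dataset; 'origin' is replaced by columns of 0s and 1s
numeric_dataset = enc.transform(df)
numeric_dataset.head()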
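To show what binary encoding means, here is a small plain-Python sketch of the technique: each category gets an integer code, and that code is then written out in binary digits. The function binary_encode is hypothetical, and starting the codes at 1 in order of first appearance is an assumption of this sketch; the exact codes and column names that category_encoders produces may differ.

```python
def binary_encode(values):
    """Return one list of 0/1 digits per input value.

    Categories get integer codes in order of first appearance, starting
    at 1 (an assumption of this sketch), and each code is written in binary.
    """
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes) + 1)

    # Number of binary digits needed for the largest code
    width = max(codes.values()).bit_length() if codes else 0

    return [
        [int(bit) for bit in format(codes[value], f"0{width}b")]
        for value in values
    ]


origins = ["usa", "japan", "europe", "usa"]
encoded = binary_encode(origins)
for origin, bits in zip(origins, encoded):
    print(origin, bits)
```

Three categories fit into two binary columns here, whereas one-hot encoding would need three. That saving grows quickly for features with many categories.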