Non-Brand Data

20 Simple Python Packages to Improve Your Data Workflow

Use these packages to accelerate your job

Cornellius Yudha Wijaya
Jan 17, 2024

Many Python packages have been developed to help data professionals with their work. In my experience, many valuable data packages are still under-recognized or only starting to gain popularity.

In this article, I want to introduce you to various Python packages that can help your data workflow in many ways. Let’s get into it!

1. Knockknock

Knockknock is a simple Python package that notifies you when machine learning model training has finished or crashed. We can receive the notification through many channels, such as email, Slack, and Microsoft Teams.

To install the package, run the following command.

pip install knockknock

For example, the following code sends an email about the status of your model training, using a Gmail address as the sender.

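from knockknock import email_sender
from sklearn.linear_model import LinearRegression
import numpy as np

# Email the recipients when the decorated function finishes or crashes
@email_sender(recipient_emails=["<your_email@address.com>", "<your_second_email@address.com>"], sender_email="<sender_email@gmail.com>")
def train_linear_model(your_nicest_parameters):
    # Toy data that follows y = 1 * x0 + 2 * x1 + 3
    x = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
    y = np.dot(x, np.array([1, 2])) + 3
    regression = LinearRegression().fit(x, y)

    # The R^2 score is the function's return value
    return regression.score(x, y)

# The notification is only sent when the function is actually called
train_linear_model(your_nicest_parameters=None)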

You will get a notification when the decorated function finishes or crashes, and the message includes whatever value the function returns.
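To make the idea concrete, here is a minimal sketch of how a notify-on-finish decorator can work. This is not Knockknock's actual implementation; the names notify_on_finish, train, and broken_train are hypothetical, and a plain list stands in for a real channel such as email or Slack.

```python
import functools


def notify_on_finish(notify):
    """Build a decorator that reports a function's outcome through `notify`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                result = func(*args, **kwargs)
            except Exception as exc:
                # Report the crash, then re-raise so the caller still sees it
                notify(f"{func.__name__} crashed: {exc!r}")
                raise
            # Report success together with the returned value
            notify(f"{func.__name__} finished, returned {result!r}")
            return result
        return wrapper
    return decorator


messages = []  # hypothetical stand-in for an email or Slack channel


@notify_on_finish(messages.append)
def train(scale):
    return 2 * scale


@notify_on_finish(messages.append)
def broken_train():
    raise ValueError("bad data")


train(21)
try:
    broken_train()
except ValueError:
    pass

for message in messages:
    print(message)
```

The wrapper re-raises the exception after notifying, so a crash is reported without being hidden from the rest of the program.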

2. Maya

Maya is a Python package that makes parsing and handling datetime data quick and simple. It offers a human-readable interface for getting the datetime values we want. Let’s start by installing the package.

pip install maya

Then, we can use the following code to access the current date easily.

import maya
now = maya.now()
print(now)

We could also create a MayaDT object for tomorrow's date.

tomorrow = maya.when('tomorrow')
tomorrow.datetime()

The package is helpful for any time series related work, so try it out.
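For comparison, here is a rough standard-library version of the two Maya snippets above. It is only a sketch that assumes we want timezone-aware UTC values; it does not show how Maya is implemented.

```python
from datetime import datetime, timedelta, timezone

# Current moment as a timezone-aware UTC datetime
now = datetime.now(timezone.utc)

# "Tomorrow" has to be computed by hand as now + 1 day
tomorrow = now + timedelta(days=1)

print(now.isoformat())
print(tomorrow.isoformat())
```

With Maya, the date arithmetic is replaced by the readable string 'tomorrow'.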

3. category_encoders

category_encoders is a Python package for categorical data encoding (transformation into numerical data). The package is a collection of various encoding methods that we can apply to categorical features, depending on what we need.

To try it out, we first need to install the package.

pip install category_encoders

Then, we could apply the transformation using the following example.

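from category_encoders import BinaryEncoder
import pandas as pd

# small stand-in dataset (an assumption; the original dataset is not shown)
df = pd.DataFrame({'origin': ['usa', 'japan', 'europe', 'usa'], 'mpg': [18.0, 27.0, 26.0, 15.0]})

# use binary encoding to encode the categorical 'origin' feature
enc = BinaryEncoder(cols=['origin']).fit(df)

# transform the dataset
numeric_dataset = enc.transform(df)
numeric_dataset.head()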
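To see what binary encoding does to a column, here is a simplified pure-Python sketch of the idea. The function binary_encode is hypothetical and written for illustration only; the library's actual code assignment and column naming may differ.

```python
def binary_encode(values):
    """Map each distinct category to an integer code, then to a list of bits."""
    # Assign integer codes in order of first appearance, starting at 1
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1

    # Number of bits needed to write the largest code in binary
    n_bits = max(codes.values()).bit_length()

    # Write each code as bits, most significant bit first
    return [
        [(codes[value] >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]
        for value in values
    ]


rows = binary_encode(["usa", "japan", "europe", "usa"])
for row in rows:
    print(row)
```

Three categories fit into two bit columns here, which is why binary encoding produces fewer columns than one-hot encoding when a feature has many categories.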