pyJedAI



An open-source library that leverages Python’s data science ecosystem to build
powerful end-to-end Entity Resolution workflows.





Overview

pyJedAI is a Python framework that aims to offer both expert and novice users robust and fast solutions for multiple types of Entity Resolution problems. It is built on state-of-the-art Python frameworks. pyJedAI is the sole open-source Link Discovery tool capable of exploiting the latest breakthroughs in Deep Learning and NLP techniques that are publicly available through the Python data science ecosystem. This applies to both blocking and matching, ensuring high time efficiency, high scalability, and high effectiveness without requiring any labelled instances from the user.
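To make the two stages named above concrete, here is a minimal, library-free sketch of unsupervised blocking followed by matching on two clean sources. It does not use pyJedAI's API: every function name (`tokenize`, `token_blocking`, `candidate_pairs`, `jaccard`, `match`), the sample records, and the 0.5 threshold are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import product


def tokenize(record):
    """Lowercase every attribute value and split it into a set of tokens."""
    tokens = set()
    for value in record.values():
        tokens.update(str(value).lower().split())
    return tokens


def token_blocking(source_a, source_b):
    """Blocking: group record ids from both sources by the tokens they contain."""
    blocks = defaultdict(lambda: (set(), set()))
    for rid, rec in source_a.items():
        for tok in tokenize(rec):
            blocks[tok][0].add(rid)
    for rid, rec in source_b.items():
        for tok in tokenize(rec):
            blocks[tok][1].add(rid)
    # A block is useful only if it holds records from both sources.
    return {tok: ids for tok, ids in blocks.items() if ids[0] and ids[1]}


def candidate_pairs(blocks):
    """Collect the distinct cross-source pairs that share at least one block."""
    pairs = set()
    for ids_a, ids_b in blocks.values():
        pairs.update(product(ids_a, ids_b))
    return pairs


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(source_a, source_b, threshold=0.5):
    """Matching: score only the candidate pairs and keep those above the threshold."""
    blocks = token_blocking(source_a, source_b)
    matches = []
    for id_a, id_b in sorted(candidate_pairs(blocks)):
        score = jaccard(tokenize(source_a[id_a]), tokenize(source_b[id_b]))
        if score >= threshold:
            matches.append((id_a, id_b, score))
    return matches


# Hypothetical sample data, not taken from any pyJedAI dataset.
shops_a = {
    "a1": {"name": "Blue Bottle Coffee", "city": "Oakland"},
    "a2": {"name": "Green Leaf Tea House", "city": "Athens"},
}
shops_b = {
    "b1": {"name": "blue bottle coffee", "city": "oakland"},
    "b2": {"name": "Red Door Bakery", "city": "Patras"},
}

matched = match(shops_a, shops_b)
print(matched)
```

Blocking is what keeps the comparison count low: only records that share a token are ever scored, and no labelled pairs are needed at any point. The library's own blocking and matching methods are covered in the tutorial notebooks.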

Key Features

  • Independent of input data type: both structured and semi-structured data can be processed.
  • Implements a variety of algorithms.
  • Easy to use.
  • Uses well-known, cutting-edge machine learning packages.
  • Offers supervised and unsupervised ML techniques.

Open demos are available in:

       

Google Colab Hands-on demo:

Install

Install the latest version of pyjedai (requires Python >= 3.7):

pip install pyjedai

More on PyPI.

Find the source code of the latest release on GitHub.
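After installing, a quick sanity check can confirm the interpreter meets the stated Python requirement and show which pyjedai version, if any, is installed. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and on Python 3.7, where `importlib.metadata` is not available, it simply reports `None`.

```python
import sys

try:
    from importlib import metadata  # part of the standard library from Python 3.8
except ImportError:  # Python 3.7: no importlib.metadata without a backport
    metadata = None


def installed_version(package):
    """Return the installed version of `package`, or None if it cannot be determined."""
    if metadata is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python >= 3.7:", sys.version_info >= (3, 7))
print("pyjedai version:", installed_version("pyjedai"))
```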

Tutorials

Tutorial                                   Notebook
Clean-Clean Entity Resolution              CleanCleanER.ipynb
Dirty Entity Resolution                    DirtyER.ipynb
Fine-Tuning using Optuna                   Optuna.ipynb
User-Friendly Approach (WorkFlow module)   WorkFlow.ipynb
Raw data to pandas DataFrame               Readers.ipynb

Dependencies

         


           


See the full list of dependencies and the versions used in this file.

Bugs, Discussions & News

GitHub Discussions is the forum for general questions and discussions, and our recommended starting point. Please report any bugs you find here.

Team & Authors


Research and development are carried out under the supervision of Prof. Manolis Koubarakis. This is a research project by the AI-Team of the Department of Informatics and Telecommunications at the University of Athens.

License

Released under the Apache-2.0 license (see LICENSE.txt).

Copyright © 2022 AI-Team, University of Athens



       

This project is funded in the context of STELAR, a HORIZON-Europe project.