From command line

Transform molecules into fingerprints

This function can also be run from the command line, using the command and arguments described below.

Specify which fingerprints to compute from the SDF file and the file format(s) in which to save the output.

usage: drug_learning.two_dimensions.main_fingerprints [-h] [-n N_WORKERS]
                                                      [-mo] [-ma] [-rd] [-md]
                                                      [-urd] [-voc VOC] [-c]
                                                      [-pq] [-f] [-hdf] [-pk]
                                                      infile {split} ...

Positional Arguments

infile

Input SDF file(s)

split

Possible choices: split

Named Arguments

-n, --nworkers

Number of workers used to parallelize the conversion of SDF files into fingerprints

Default: 1

-mo, --morgan

Convert molecules to Morgan fingerprint

-ma, --maccs

Convert molecules to MACCS fingerprint

-rd, --rdkit

Convert molecules to RDKit fingerprint

-md, --mordred

Convert molecules to Mordred fingerprint

-urd, --unfolded_rdkit

Convert molecules to unfolded RDKit fingerprint

-voc, --vocabulary

Vocabulary for unfolded RDKit fingerprint

-c, --csv

Save output to csv

Default: False

-pq, --parquet

Save output to parquet

Default: False

-f, --feather

Save output to feather

Default: False

-hdf, --hdf

Save output to hdf

Default: False

-pk, --pickle

Save output to pickle

Default: False
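
As an illustration of how the options above combine, the following sketch assembles one invocation that computes Morgan and MACCS fingerprints with four workers and saves the result as CSV. It assumes the module can be launched with `python -m`, which the usage line suggests but does not confirm; `build_command` and `molecules.sdf` are hypothetical names used only for this example.

```python
import shlex
from typing import List, Sequence

# Assumption: the module is runnable with "python -m"; adjust this if the
# package installs a dedicated console script instead.
MODULE = "drug_learning.two_dimensions.main_fingerprints"

# Short flags copied from the argument list above.
FINGERPRINT_FLAGS = {"-mo", "-ma", "-rd", "-md", "-urd"}
FORMAT_FLAGS = {"-c", "-pq", "-f", "-hdf", "-pk"}


def build_command(infile: str, flags: Sequence[str], n_workers: int = 1) -> List[str]:
    """Return the argument list for one fingerprint run (hypothetical helper)."""
    unknown = set(flags) - FINGERPRINT_FLAGS - FORMAT_FLAGS
    if unknown:
        raise ValueError(f"Unknown flag(s): {sorted(unknown)}")
    # Options go first and the input file last, matching the usage line.
    return ["python", "-m", MODULE, "-n", str(n_workers), *flags, infile]


command = build_command("molecules.sdf", ["-mo", "-ma", "-c"], n_workers=4)
print(shlex.join(command))
# python -m drug_learning.two_dimensions.main_fingerprints -n 4 -mo -ma -c molecules.sdf
```

Once the package is installed, the resulting list can be passed to `subprocess.run(command, check=True)` to execute it.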

Sub-commands:

split

Split the input SDF files into chunks

drug_learning.two_dimensions.main_fingerprints split [-h] [-ch N_CHUNKS]
                                                     [-nsp NW_SP]

Named Arguments

-ch, --chunk

Number of molecules in each chunk.

Default: 1000

-nsp, --nworkers-sp

Number of workers to parallelize the split into chunks.

Default: 1
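
To make the chunk size concrete, the sketch below groups SDF records into chunks with a fixed number of molecules each. It is not the package's implementation, only an illustration that relies on the SDF convention that every record ends with a `$$$$` line; `iter_sdf_records` and `iter_chunks` are hypothetical helpers.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_sdf_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield one SDF record at a time; each record ends with a '$$$$' line."""
    record: List[str] = []
    for line in lines:
        record.append(line)
        if line.strip() == "$$$$":
            yield "".join(record)
            record = []


def iter_chunks(records: Iterable[str], chunk_size: int = 1000) -> Iterator[List[str]]:
    """Group records into lists of at most chunk_size molecules."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Five placeholder records (not valid molecules), split two per chunk.
fake_sdf_lines: List[str] = []
for i in range(5):
    fake_sdf_lines += [f"mol{i}\n", "$$$$\n"]

chunks = list(iter_chunks(iter_sdf_records(fake_sdf_lines), chunk_size=2))
print([len(chunk) for chunk in chunks])  # [2, 2, 1]
```

With the default of 1000 molecules per chunk, an input of 2500 molecules would produce three chunks under this scheme, the last one holding 500 molecules.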