teilab.utils.download_utils module

teilab.utils.download_utils.unzip(path: str, verbose: bool = True) → Tuple[str, List[str]]

Unzip a zipped file. Only files with the .zip extension are supported.

Parameters
  • path (str) – The path to the zipped file.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

Returns

The directory where the expanded data is stored and a list of the extracted file paths.

Return type

Tuple[str,List[str]]

Examples

>>> from teilab.utils import unzip
>>> unzip("target.zip")
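
The behaviour described above can be approximated with the standard library. The following is a minimal sketch, not teilab's actual implementation: the name unzip_sketch, the choice to extract into a sibling directory named after the archive, and the ValueError for other extensions are all assumptions made for illustration.

```python
import os
import zipfile
from typing import List, Tuple


def unzip_sketch(path: str, verbose: bool = True) -> Tuple[str, List[str]]:
    """Hypothetical stand-in for ``unzip``; not teilab's actual code."""
    root, ext = os.path.splitext(path)
    if ext.lower() != ".zip":
        # Assumption: reject anything that is not a .zip archive.
        raise ValueError(f"Only '.zip' files are supported, got {ext!r}")
    # Assumption: extract into a sibling directory named after the archive.
    with zipfile.ZipFile(path) as zf:
        zf.extractall(root)
        # Keep file entries only; directory entries end with "/".
        members = [name for name in zf.namelist() if not name.endswith("/")]
    extracted = [os.path.join(root, name) for name in members]
    if verbose:
        for extracted_path in extracted:
            print(f"* {extracted_path}")
    return root, extracted
```

Under these assumptions, calling unzip_sketch("target.zip") would extract into ./target and return that directory together with the extracted file paths.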

teilab.utils.download_utils.decide_extension(content_encoding: Optional[str] = None, content_type: Optional[str] = None, basename: Optional[str] = None)

Decide the file extension based on content_encoding, content_type, and basename.

Parameters
  • content_encoding (Optional[str], optional) – The Content-Encoding entity header, which indicates the compression applied to the media type.

  • content_type (Optional[str], optional) – The MIME type of the resource or the data.

  • basename (Optional[str], optional) – The basename.

Returns

The file extension which starts with “.”

Return type

str

Examples

>>> from teilab.utils import decide_extension
>>> decide_extension(content_encoding="image/png")
'.png'
>>> decide_extension(content_type="application/pdf")
'.pdf'
>>> decide_extension(content_encoding="image/webp", content_type="application/pdf")
'.webp'
>>> decide_extension(basename="hoge.zip")
'.zip'
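
The examples above suggest a precedence order: content_encoding is consulted first, then content_type, then the extension already present in basename. The sketch below reproduces that order under stated assumptions. The name decide_extension_sketch, the small MIME_TO_EXT table, and the empty-string fallback are hypothetical and are not taken from teilab.

```python
import os
from typing import Optional

# Small illustrative table (an assumption); a real one would cover more types.
MIME_TO_EXT = {
    "image/png": ".png",
    "image/webp": ".webp",
    "application/pdf": ".pdf",
    "application/zip": ".zip",
    "text/html": ".html",
}


def decide_extension_sketch(
    content_encoding: Optional[str] = None,
    content_type: Optional[str] = None,
    basename: Optional[str] = None,
) -> str:
    """Hypothetical stand-in; the precedence is inferred from the examples."""
    # Check content_encoding first, then content_type.
    for value in (content_encoding, content_type):
        if value:
            # Drop parameters such as "; charset=utf-8" before the lookup.
            mime = value.split(";")[0].strip().lower()
            if mime in MIME_TO_EXT:
                return MIME_TO_EXT[mime]
    # Fall back to the extension already present in the basename.
    if basename:
        return os.path.splitext(basename)[1]
    # Assumption: return an empty string when nothing can be decided.
    return ""
```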

class teilab.utils.download_utils.Downloader

Bases: object

General Downloader

classmethod download_file(url: str, dirname: str = '.', basename: str = '', path: Optional[str] = None, verbose: bool = True, expand: bool = True, **kwargs) → str

Download a file and optionally expand it.

Parameters
  • url (str) – The URL of the file you want to download.

  • dirname (str, optional) – The directory where downloaded data will be saved. Defaults to ".".

  • basename (str, optional) – The basename of the target file. Defaults to "".

  • path (Optional[str], optional) – Where and what name to save the downloaded file. Defaults to None.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

  • expand (bool, optional) – Whether to expand the downloaded file. Defaults to True.

Returns

The path to the downloaded file.

Return type

str

Examples

>>> import os
>>> from teilab.utils import Downloader
>>> path = Downloader.download_file(url="http://ui-tei.rnai.jp/")
[Download] URL: http://ui-tei.rnai.jp/
* Content-Encoding : None
* Content-Length   : 32.1 [KB]
* Content-Type     : text/html
* Save Destination : ./2021-06-01@21.30.html
===== Progress =====
2021-06-01@21.30.04 100.0%[####################] 0.0[s] 1.3[MB/s] eta -0.0[s]
Do not support to extract files with the '.html' extension.
>>> os.path.exists(path)
True

static download_target_file(url: str, dirname: str = '.', basename: str = '.', path: Optional[str] = None, bar_width: int = 20, verbose: bool = True, **kwargs) → str

Download the target file.

Parameters
  • url (str) – The URL of the file you want to download.

  • dirname (str, optional) – The directory where downloaded data will be saved. Defaults to ".".

  • basename (str, optional) – The basename of the target file. Defaults to ".".

  • path (Optional[str], optional) – Where and what name to save the downloaded file. Defaults to None.

  • bar_width (int, optional) – The width of progress bar. Defaults to 20.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

Returns

The path to the downloaded file.

Return type

str

Examples

>>> import os
>>> from teilab.utils import Downloader
>>> path = Downloader.download_target_file(url="http://ui-tei.rnai.jp/")
[Download] URL: http://ui-tei.rnai.jp/
* Content-Encoding : None
* Content-Length   : 31.8 [KB]
* Content-Type     : text/html
* Save Destination : ./2021-06-01@11.26.html
===== Progress =====
2021-06-01@11.26.48 100.0%[####################] 0.0[s] 1.0[MB/s]   eta -0.0[s]
>>> os.path.exists(path)
True
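
The progress line in the output above implies that the file is read in chunks while a bar of bar_width characters is redrawn. The sketch below shows one way to do that with urllib from the standard library. It is not teilab's code: download_sketch and render_bar are hypothetical names, the "-" filler for the unfinished part of the bar is an assumption, and the 32768-byte chunk size is borrowed from GoogleDriveDownloader.CHUNK_SIZE.

```python
import urllib.request


def render_bar(done: int, total: int, bar_width: int = 20) -> str:
    """Render a bar such as '100.0%[####################]'."""
    # With an unknown total (0), the bar is simply shown as complete.
    ratio = min(done / total, 1.0) if total > 0 else 1.0
    filled = int(ratio * bar_width)
    # Assumption: unfinished cells are drawn with "-".
    return f"{ratio * 100:.1f}%[{'#' * filled}{'-' * (bar_width - filled)}]"


def download_sketch(
    url: str,
    path: str,
    chunk_size: int = 32768,
    bar_width: int = 20,
    verbose: bool = True,
) -> str:
    """Hypothetical chunked download; returns the destination path."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as f:
        # Content-Length may be missing, in which case total stays 0.
        total = int(response.headers.get("Content-Length", 0))
        done = 0
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            f.write(chunk)
            done += len(chunk)
            if verbose:
                # "\r" returns to the line start so the bar redraws in place.
                print("\r" + render_bar(done, total, bar_width), end="")
    if verbose:
        print()
    return path
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, which is the usual reason for this pattern.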

static prepare_for_download(url: str = '', dirname: str = '.', basename: str = '', path: Optional[str] = None, headers: Optional[Dict[str, str]] = None, verbose: bool = True) → Tuple[str, str]

Get information from the web file's header and prepare for downloading.

Parameters
  • url (str, optional) – The URL of the file you want to download. Defaults to "".

  • dirname (str, optional) – The directory where downloaded data will be saved. Defaults to ".".

  • basename (str, optional) – The basename of the target file. Defaults to "".

  • path (Optional[str], optional) – Where and what name to save the downloaded file. Defaults to None.

  • headers (Optional[Dict[str,str]], optional) – The header information of the target file. Defaults to None.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

Returns

The filename and path of the file that will be downloaded.

Return type

Tuple[str,str]

Examples

>>> from teilab.utils import Downloader
>>> filename, path = Downloader.prepare_for_download(
...     url="http://ui-tei.rnai.jp/",
...     basename="index.html",
...     dirname=".",
...     path=None,
... )
[Download] URL: http://ui-tei.rnai.jp/
* Content-Encoding : None
* Content-Length   : 32.1 [KB]
* Content-Type     : text/html
* Save Destination : ./index.html
>>> filename, path
('index.html', './index.html')
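
The way dirname, basename, and path combine into the returned pair can be sketched as follows. This is a guess at the logic, not teilab's implementation: resolve_destination is a hypothetical helper, the rule that an explicit path takes precedence is an assumption, and the timestamp format is inferred from save destinations such as ./2021-06-01@21.30.html in the outputs above.

```python
import os
from datetime import datetime
from typing import Optional, Tuple


def resolve_destination(
    dirname: str = ".",
    basename: str = "",
    path: Optional[str] = None,
    ext: str = "",
    now: Optional[datetime] = None,
) -> Tuple[str, str]:
    """Hypothetical helper returning (filename, path); not part of teilab."""
    if path is not None:
        # Assumption: an explicit path wins over dirname and basename.
        return os.path.basename(path), path
    if not basename:
        # Assumption: timestamped fallback shaped like "2021-06-01@21.30.html".
        now = now or datetime.now()
        basename = now.strftime("%Y-%m-%d@%H.%M") + ext
    return basename, os.path.join(dirname, basename)
```

With dirname="." and basename="index.html" this returns ('index.html', './index.html') on POSIX systems, matching the example above.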

class teilab.utils.download_utils.GoogleDriveDownloader

Bases: teilab.utils.download_utils.Downloader

Downloader specialized for files on Google Drive.

CHUNK_SIZE = 32768
DRIVE_URL = 'https://docs.google.com/uc?export=download'

static prepare_for_download(url: str = '', dirname: str = '.', basename: str = '', path: Optional[str] = None, headers: Optional[Dict[str, str]] = None, verbose: bool = True, driveId: Optional[str] = None) → Tuple[str, str]

Get information from the web file's header and prepare for downloading.

Parameters
  • url (str, optional) – The URL of the file you want to download. Defaults to "".

  • dirname (str, optional) – The directory where downloaded data will be saved. Defaults to ".".

  • basename (str, optional) – The basename of the target file. Defaults to "".

  • path (Optional[str], optional) – Where and what name to save the downloaded file. Defaults to None.

  • headers (Optional[Dict[str,str]], optional) – The header information of the target file. Defaults to None.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

  • driveId (Optional[str], optional) – The Google Drive file ID. Defaults to None.

Returns

The filename and path of the file that will be downloaded.

Return type

Tuple[str,str]

Examples

>>> from teilab.utils import Downloader
>>> filename, path = Downloader.prepare_for_download(
...     url="http://ui-tei.rnai.jp/",
...     basename="index.html",
...     dirname=".",
...     path=None,
... )
[Download] URL: http://ui-tei.rnai.jp/
* Content-Encoding : None
* Content-Length   : 32.1 [KB]
* Content-Type     : text/html
* Save Destination : ./index.html
>>> filename, path
('index.html', './index.html')

static download_target_file(url: str, dirname: str = '.', basename: str = '', path: Optional[str] = None, driveId: Optional[str] = None, verbose: bool = True, **kwargs) → str

Download the target Google Drive file.

Parameters
  • url (str) – The URL of the file you want to download.

  • dirname (str, optional) – The directory where downloaded data will be saved. Defaults to ".".

  • basename (str, optional) – The basename of the target file. Defaults to "".

  • path (Optional[str], optional) – Where and what name to save the downloaded file. Defaults to None.

  • driveId (Optional[str], optional) – The Google Drive file ID. Defaults to None.

  • verbose (bool, optional) – Whether to print verbose output. Defaults to True.

Raises

TypeError – When the Google Drive file ID cannot be detected from either driveId or url.

Returns

The path to the downloaded file.

Return type

str
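
The TypeError described above implies a step that resolves the file ID from either driveId or url. The sketch below shows one plausible way to do this with urllib.parse. The helper name extract_drive_id is hypothetical, and the two URL shapes it recognizes (an id query parameter and a /d/<ID>/ path segment) are assumptions about what teilab accepts.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_drive_id(url: str = "", driveId: Optional[str] = None) -> str:
    """Hypothetical helper: resolve a Google Drive file ID or raise TypeError."""
    if driveId:
        # An explicitly supplied ID needs no parsing.
        return driveId
    parsed = urlparse(url)
    # Form 1: ...?export=download&id=<ID>
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    # Form 2: .../file/d/<ID>/view
    match = re.search(r"/d/([\w-]+)", parsed.path)
    if match:
        return match.group(1)
    raise TypeError(f"Could not detect a Google Drive file ID from url={url!r}")
```

The resolved ID could then be sent as the id parameter alongside DRIVE_URL, although how teilab builds that request is not shown in this page.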

teilab.utils.download_utils.decide_downloader(url: str) → teilab.utils.download_utils.Downloader

Decide which Downloader to use based on the URL.

Parameters

url (str) – The URL of the file you want to download.

Returns

The downloader class suited to the target URL.

Return type

Downloader

Examples

>>> from teilab.utils import decide_downloader
>>> decide_downloader("https://www.dropbox.com/sh/ID").__name__
'Downloader'
>>> decide_downloader("https://drive.google.com/u/0/uc?export=download&id=ID").__name__
'GoogleDriveDownloader'
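
Because the examples read __name__ from the result, the function appears to return a class rather than an instance. A minimal sketch of such a dispatch, keyed on the URL's host, might look like the following. The two stand-in classes and the list of Google hosts are assumptions for illustration; teilab's actual matching rule is not documented here.

```python
from typing import Type
from urllib.parse import urlparse


class DownloaderSketch:
    """Hypothetical stand-in for teilab's Downloader."""


class GoogleDriveDownloaderSketch(DownloaderSketch):
    """Hypothetical stand-in for teilab's GoogleDriveDownloader."""


# Assumption: these hosts identify Google Drive links.
GOOGLE_DRIVE_HOSTS = ("drive.google.com", "docs.google.com")


def decide_downloader_sketch(url: str) -> Type[DownloaderSketch]:
    """Return the downloader class (not an instance) suited to the URL."""
    host = urlparse(url).netloc.lower()
    if host in GOOGLE_DRIVE_HOSTS:
        return GoogleDriveDownloaderSketch
    # Every other host falls back to the general downloader.
    return DownloaderSketch
```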