gummy.utils.download_utils module

Utility programs for downloading

gummy.utils.download_utils.decide_extension(content_encoding=None, content_type=None, filename=None)[source]

Decide the file extension based on content_encoding and content_type.

Parameters
  • content_encoding (str) – The Content-Encoding entity header, which indicates how the media type has been compressed.

  • content_type (str) – The MIME type of the resource or the data.

  • filename (str) – The filename.

Returns

The file extension, starting with “.”

Return type

ext (str)

Examples

>>> from gummy.utils import decide_extension
>>> decide_extension(content_encoding="x-gzip", content_type="application/zip")
.gz
>>> decide_extension(content_encoding=None, content_type="image/png")
.png
>>> decide_extension(content_encoding=None, content_type="application/pdf")
.pdf
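
The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name sketch_decide_extension, both lookup tables, and the precedence (encoding first, then MIME type, then the filename suffix) are assumptions chosen to agree with the examples above.

```python
import os

# Hypothetical lookup tables for illustration; the library's real tables may differ.
ENCODING_TO_EXT = {"gzip": ".gz", "x-gzip": ".gz", "bzip2": ".bz2"}
TYPE_TO_EXT = {
    "application/pdf": ".pdf",
    "application/zip": ".zip",
    "image/png": ".png",
}


def sketch_decide_extension(content_encoding=None, content_type=None, filename=None):
    """Pick an extension: encoding first, then MIME type, then the filename suffix."""
    if content_encoding in ENCODING_TO_EXT:
        return ENCODING_TO_EXT[content_encoding]
    if content_type is not None:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime = content_type.split(";")[0].strip().lower()
        if mime in TYPE_TO_EXT:
            return TYPE_TO_EXT[mime]
    # Fall back to whatever suffix the filename carries (may be an empty string).
    return os.path.splitext(filename or "")[1]
```

Checking the encoding first means a gzip-compressed zip payload is saved as .gz, which matches the first example above.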
gummy.utils.download_utils.download_file(url, dirname='.', path=None, bar_width=20, verbose=True)[source]

Download a file.

Parameters
  • url (str) – File URL.

  • dirname (str) – The directory where the downloaded data will be saved.

  • path (str) – path/to/downloaded_file

  • bar_width (int) – The width of the progress bar.

  • verbose (bool) – Whether to print verbose output.

Returns

path/to/downloaded_file

Return type

path (str)

Examples

>>> from gummy.utils import download_file
>>> download_file(url="https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_eye.xml")
Download a file from https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_eye.xml
            * Content-Encoding : None
            * Content-Length   : (333.404296875, 'MB')
            * Content-Type     : text/plain; charset=utf-8
            * Save Destination : ./haarcascade_eye.xml
haarcascade_eye.xml     100.0%[####################] 0.1[s] 5.5[GB/s]   eta -0.0[s]
'./haarcascade_eye.xml'
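
Two pieces of the behaviour shown above, choosing the save destination and drawing the progress bar, can be sketched without touching the network. The helper names sketch_destination and sketch_progress_bar are hypothetical, and the assumption that path overrides dirname is inferred from the parameter descriptions rather than from the library source.

```python
import os
import posixpath
from urllib.parse import urlparse


def sketch_destination(url, dirname=".", path=None):
    """Return where a download would be saved (assumes an explicit path wins)."""
    if path is not None:
        return path
    # Use the last component of the URL path as the file name.
    name = posixpath.basename(urlparse(url).path)
    return os.path.join(dirname, name)


def sketch_progress_bar(fraction, bar_width=20):
    """Render a text bar such as '100.0%[####################]'."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to [0, 1]
    filled = int(bar_width * fraction)
    return f"{fraction:6.1%}[{'#' * filled}{'-' * (bar_width - filled)}]"
```

For the URL in the example above, sketch_destination returns the file name haarcascade_eye.xml joined to the current directory, in line with the Save Destination line of the log.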
gummy.utils.download_utils.src2base64(src, base=None)[source]

Create base64 encoded img tag from src url or <img> tag element.

Parameters
  • src (str, bs4.element.Tag) – Image src url, or <img> tag element.

  • base (str) – Base URL. If src is a relative URL, it is joined with base to form an absolute URL.

Returns

base64 encoded img tag

Return type

str

Examples

>>> from gummy.utils import src2base64
>>> img_tag = src2base64(src="https://iwasakishuto.github.io/images/contents-icon/Translation-Gummy.png")
>>> with open("sample.html", mode="w") as f:
...     f.write(img_tag)
>>> # open sample.html to check the results.
>>> img_tag = src2base64(src="https://iwasakishuto.github.io/images/XXX/XXXXX.png")
Tried to get an image but got an error: HTTP Error 404: Not Found
>>> with open("error.html", mode="w") as f:
...     f.write(img_tag)
>>> # open error.html to check the results.
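
The two ideas behind this function, resolving a relative src against base and embedding image bytes as a base64 data URI, can be sketched as below. The helper names sketch_absolute_src and sketch_img_tag are hypothetical, the default MIME type is an assumption, and the real function additionally fetches the image and accepts a bs4 tag.

```python
import base64
from urllib.parse import urljoin


def sketch_absolute_src(src, base=None):
    """Resolve a possibly relative image URL against a base URL."""
    return urljoin(base, src) if base else src


def sketch_img_tag(data, mime="image/png"):
    """Wrap raw image bytes in an <img> tag that uses a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f'<img src="data:{mime};base64,{encoded}">'
```

Because the image data lives inside the tag itself, the resulting HTML can be opened offline without the original image URL being reachable.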
gummy.utils.download_utils.path2base64(path)[source]

Create base64 encoded img tag from local image.

Parameters

path (str) – path/to/image.

Returns

base64 encoded img tag

Return type

str

Examples

>>> from gummy.utils import path2base64, download_file
>>> path = download_file(url="https://iwasakishuto.github.io/images/contents-icon/Translation-Gummy.png")
Download a file from https://iwasakishuto.github.io/images/contents-icon/Translation-Gummy.png
            * Content-Encoding : None
            * Content-Length   : 21.4 [MB]
            * Content-Type     : image/png
            * Save Destination : ./Translation-Gummy.png
Translation-Gummy.png   100.0%[####################] 0.0[s] 3.4[GB/s]   eta -0.0[s]
>>> img_tag = path2base64(path=path)
>>> with open("sample.html", mode="w") as f:
...     f.write(img_tag)
>>> # open sample.html to check the results.
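
For a local file, the same kind of tag can be produced with the standard library alone. This is a sketch under assumptions: the name sketch_path2base64 is hypothetical, and guessing the MIME type from the file suffix is a choice made here, not something the source documents.

```python
import base64
import mimetypes


def sketch_path2base64(path):
    """Read a local image and return an <img> tag with a base64 data URI."""
    # Guess the MIME type from the suffix; fall back to a generic binary type.
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f'<img src="data:{mime};base64,{encoded}">'
```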
gummy.utils.download_utils.match2path(file, dirname='/Users/iwasakishuto/.gummy')[source]

Resolve a URL or path to a local path, downloading the file first if file is a URL.

Parameters
  • file (data, str) – URL, path, or data of a PDF.

  • dirname (str) – If file is a URL, download it and save it to dirname. (default=GUMMY_DIR)

Returns

path to a PDF.

Return type

str
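
The dispatch described above can be sketched as follows. Everything here is an assumption made for illustration: the name sketch_match2path, treating "data" as bytes, the fixed file name data.pdf, and the injected download callable (which stands in for a downloader such as download_file so the sketch runs without network access).

```python
import os


def sketch_match2path(file, dirname, download=None):
    """Return a local path for a URL, an existing path, or raw PDF bytes."""
    if isinstance(file, bytes):
        # Raw data: write it into dirname under a hypothetical fixed name.
        os.makedirs(dirname, exist_ok=True)
        path = os.path.join(dirname, "data.pdf")
        with open(path, "wb") as f:
            f.write(file)
        return path
    if file.startswith(("http://", "https://")):
        # URL: delegate to the supplied downloader, which returns the saved path.
        if download is None:
            raise ValueError("a download callable is required for URLs")
        return download(file, dirname)
    # Anything else is treated as a path that is already local.
    return file
```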