file_handler package

Submodules

file_handler.analysis_target module

class credsweeper.file_handler.analysis_target.AnalysisTarget(line, line_num, lines, file_path)[source]

Bases: object

file_path: str
line: str
line_num: int
lines: List[str]
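
The four fields can be pictured as a small record with one instance per row. The following is a minimal sketch, not the credsweeper source: AnalysisTargetSketch and build_targets are hypothetical names that mirror the documented constructor order, and treating line_num as 1-based is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnalysisTargetSketch:
    """Hypothetical stand-in mirroring the documented fields of AnalysisTarget."""
    line: str          # text of the row being analysed
    line_num: int      # row number (assumed 1-based in this sketch)
    lines: List[str]   # all rows of the file, kept for context
    file_path: str     # where the rows came from


def build_targets(lines: List[str], file_path: str) -> List[AnalysisTargetSketch]:
    """Create one target per row; every target shares the same list of rows."""
    return [
        AnalysisTargetSketch(line, number, lines, file_path)
        for number, line in enumerate(lines, start=1)
    ]


targets = build_targets(["user = 'admin'", "password = 'hunter2'"], "config.py")
```

Each target carries the full list of rows, so a check on one row can still look at its neighbours.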

file_handler.content_provider module

class credsweeper.file_handler.content_provider.ContentProvider(file_path, change_type=None, diff=None)[source]

Bases: abc.ABC

Base class that provides access to analysis targets for a scanned object.

abstract get_analysis_target()[source]

Load and preprocess file diff data to scan.

Return type

List[AnalysisTarget]

Returns

row objects to analyse
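
The contract of the abstract method can be illustrated with the standard abc module. This is a sketch under assumptions: Target, ProviderSketch and InMemoryProvider are hypothetical stand-ins written for this example, not classes from credsweeper.

```python
from abc import ABC, abstractmethod
from typing import List, NamedTuple, Optional


class Target(NamedTuple):
    """Hypothetical stand-in for AnalysisTarget."""
    line: str
    line_num: int
    lines: List[str]
    file_path: str


class ProviderSketch(ABC):
    """Hypothetical stand-in for the documented base class."""

    def __init__(self, file_path: str, change_type: Optional[str] = None, diff=None) -> None:
        self.file_path = file_path
        self.change_type = change_type
        self.diff = diff

    @abstractmethod
    def get_analysis_target(self) -> List[Target]:
        """Return one target per row to analyse."""


class InMemoryProvider(ProviderSketch):
    """Concrete example that serves rows already held in memory."""

    def __init__(self, file_path: str, rows: List[str]) -> None:
        super().__init__(file_path)
        self.rows = rows

    def get_analysis_target(self) -> List[Target]:
        return [Target(row, number, self.rows, self.file_path)
                for number, row in enumerate(self.rows, start=1)]


provider = InMemoryProvider("notes.txt", ["token = abc", "# comment"])
targets = provider.get_analysis_target()
```

Because get_analysis_target is abstract, instantiating ProviderSketch directly raises TypeError; only subclasses that implement it can be created.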

file_handler.diff_content_provider module

class credsweeper.file_handler.diff_content_provider.DiffContentProvider(file_path, change_type, diff)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Provide data from a single .patch file.

Parameters
  • file_path (str) – path to file

  • change_type (str) – which file data to scan: added or deleted

  • diff (List[Dict]) –

    list of file row changes, with base elements represented as:

    {
        "old": line number before diff,
        "new": line number after diff,
        "line": line text,
        "hunk": diff hunk number
    }
    

get_analysis_target()[source]

Preprocess file diff data to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets of every row of file diff corresponding to change type “self.change_type”

parse_lines_data(lines_data)[source]

Parse diff lines data.

Return a list of line numbers with change type “self.change_type” and a list of all lines in the file in original order (every line not mentioned in the diff file is replaced with a blank line).

Parameters

lines_data (List[DiffRowData]) – data of all rows mentioned in diff file

Return type

Tuple[List[int], List[str]]

Returns

tuple of line numbers with change type “self.change_type” and all file lines in original order (lines not mentioned in the diff file are replaced with blank lines)
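
The returned pair can be sketched on plain dictionaries in the shape documented for the diff parameter. This is an illustration, not the library code: parse_rows_sketch is a hypothetical name, and it assumes that an added row has "old" set to None, a deleted row has "new" set to None, and line numbers are 1-based.

```python
from typing import Any, Dict, List, Tuple

DiffRow = Dict[str, Any]


def parse_rows_sketch(rows: List[DiffRow], change_type: str) -> Tuple[List[int], List[str]]:
    """Return (changed line numbers, all file lines with gaps left blank)."""
    if change_type not in ("added", "deleted"):
        raise ValueError(f"unsupported change type: {change_type}")
    # Added rows are numbered in the new file, deleted rows in the old one.
    own_key, other_key = ("new", "old") if change_type == "added" else ("old", "new")

    changed: List[int] = []
    text_by_number: Dict[int, str] = {}
    for row in rows:
        number = row[own_key]
        if number is None:
            continue  # the row does not exist on this side of the diff
        text_by_number[number] = str(row["line"])
        if row[other_key] is None:
            changed.append(number)

    last = max(text_by_number, default=0)
    # Rows the diff never mentions become blank lines, so indices stay aligned.
    all_lines = [text_by_number.get(n, "") for n in range(1, last + 1)]
    return changed, all_lines


rows = [
    {"old": 2, "new": 2, "line": "import os", "hunk": 1},
    {"old": None, "new": 3, "line": "password = 'hunter2'", "hunk": 1},
    {"old": 3, "new": None, "line": "password = None", "hunk": 1},
]
added_numbers, added_view = parse_rows_sketch(rows, "added")
```

Line 1 is never mentioned in the example diff, so it appears as an empty string in added_view and the changed row keeps its real position, 3.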

file_handler.file_path_extractor module

class credsweeper.file_handler.file_path_extractor.FilePathExtractor[source]

Bases: object

classmethod apply_gitignore(detected_files)[source]

Apply gitignore rules for each file.

Parameters

detected_files (List[str]) – list of files to be checked

Return type

List[str]

Returns

list of files with all git-ignored files removed

classmethod check_exclude_file(config, path)[source]

Return type

bool

classmethod get_file_paths(config, path)[source]

Get all files in the directory. Automatically exclude non-code or data files (such as .jpg).

Parameters
  • config (Config) – credsweeper configuration

  • path (str) – path to the file or directory to be scanned

Return type

List[str]

Returns

list of all non-excluded files in the directory
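
The directory walk with extension filtering can be sketched with the standard library. The function name list_scannable_files and the EXCLUDED_EXTENSIONS set are hypothetical; in credsweeper the exclusion rules come from the configuration object, whose contents are not shown here.

```python
import os
import tempfile
from typing import List, Set

# Hypothetical exclusion list, used only for this sketch.
EXCLUDED_EXTENSIONS: Set[str] = {".jpg", ".png", ".gif"}


def list_scannable_files(path: str, excluded: Set[str] = EXCLUDED_EXTENSIONS) -> List[str]:
    """Return path itself if it is a file, else every file below it, minus excluded ones."""
    if os.path.isfile(path):
        candidates = [path]
    else:
        candidates = [
            os.path.join(root, name)
            for root, _dirs, names in os.walk(path)
            for name in names
        ]
    # Compare extensions case-insensitively so "logo.JPG" is excluded too.
    return sorted(
        p for p in candidates
        if os.path.splitext(p)[1].lower() not in excluded
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ("app.py", "logo.JPG", "settings.yaml"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as handle:
            handle.write("x")
    found = [os.path.basename(p) for p in list_scannable_files(tmp)]
```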

classmethod is_valid_path(path)[source]

Locate nearest .git directory to the path and check if path is ignored.

Parameters

path (str) – path to the file or directory to check

Return type

bool

Returns

False if file is ignored by git. True otherwise

located_repos = {}
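
The first half of is_valid_path, locating the nearest .git directory, can be sketched with pathlib. This is an assumption-laden illustration: nearest_git_root and _repo_cache are hypothetical names, and using a dictionary as a cache (which located_repos may or may not be) is a guess. The second half, asking git whether the path is ignored, is left out.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional

# Hypothetical cache: directory -> nearest repository root (or None).
_repo_cache: Dict[Path, Optional[Path]] = {}


def nearest_git_root(path: str) -> Optional[Path]:
    """Walk upwards from path until a directory containing .git is found."""
    start = Path(path).resolve()
    if start.is_file():
        start = start.parent
    if start in _repo_cache:
        return _repo_cache[start]
    found: Optional[Path] = None
    for candidate in (start, *start.parents):
        if (candidate / ".git").is_dir():
            found = candidate
            break
    _repo_cache[start] = found
    return found


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp).resolve() / "repo"
    nested = repo / "src" / "pkg"
    nested.mkdir(parents=True)
    (repo / ".git").mkdir()
    root_for_nested = nearest_git_root(str(nested))
    is_repo_root = root_for_nested == repo
```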

file_handler.files_provider module

class credsweeper.file_handler.files_provider.FilesProvider(paths, change_type=None, skip_ignored=None)[source]

Bases: abc.ABC

Base class for all files provider objects.

Parameters
  • paths (List[str]) – list of paths to scan

  • change_type (Optional[str]) – type of changes in the patch to analyse (added or deleted)

  • skip_ignored (Optional[bool]) – whether to check directories against the list of ignored directories from the gitignore file

abstract get_scannable_files(config)[source]

Get the list of file objects for analysis based on the attribute “paths”.

Parameters

config (Dict) – dict of credsweeper configuration

Return type

List[ContentProvider]

Returns

file objects to analyse

file_handler.patch_provider module

class credsweeper.file_handler.patch_provider.PatchProvider(paths, change_type=None, skip_ignored=None)[source]

Bases: credsweeper.file_handler.files_provider.FilesProvider

Provide data from a list of .patch files.

Allows scanning only the data that has changed between git commits, rather than the entire project.

Parameters
  • paths (List[str]) – file paths list to scan. All files should be in .patch format

  • change_type (Optional[str]) – type of changes in the patch to analyse (added or deleted)

  • skip_ignored (Optional[bool]) – whether to check directories against the list of ignored directories from the gitignore file
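
How the text of a .patch file might be turned into the row dictionaries that DiffContentProvider documents can be sketched as follows. This is not the credsweeper parser: rows_from_unified_diff is a hypothetical helper that handles a single-file unified diff only, and representing the missing side of a row as None and numbering hunks from 1 are assumptions.

```python
import re
from typing import Any, Dict, List

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def rows_from_unified_diff(patch_lines: List[str]) -> List[Dict[str, Any]]:
    """Turn the hunks of a single-file unified diff into row dictionaries."""
    rows: List[Dict[str, Any]] = []
    old_num = new_num = hunk = 0
    for raw in patch_lines:
        header = HUNK_HEADER.match(raw)
        if header:
            # A hunk header resets both counters to its starting line numbers.
            old_num, new_num = int(header.group(1)), int(header.group(2))
            hunk += 1
            continue
        if hunk == 0:
            continue  # file headers such as "--- a/x" and "+++ b/x"
        if raw.startswith("+"):
            rows.append({"old": None, "new": new_num, "line": raw[1:], "hunk": hunk})
            new_num += 1
        elif raw.startswith("-"):
            rows.append({"old": old_num, "new": None, "line": raw[1:], "hunk": hunk})
            old_num += 1
        elif raw.startswith(" "):
            # Context rows exist on both sides and advance both counters.
            rows.append({"old": old_num, "new": new_num, "line": raw[1:], "hunk": hunk})
            old_num += 1
            new_num += 1
    return rows


patch = [
    "--- a/config.py",
    "+++ b/config.py",
    "@@ -1,2 +1,2 @@",
    " import os",
    "-password = None",
    "+password = 'hunter2'",
]
rows = rows_from_unified_diff(patch)
```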

get_files_sequence(raw_patches)[source]

Return type

List[DiffContentProvider]

get_scannable_files(config)[source]

Get files to scan. Output based on the paths field.

Parameters

config (Dict) – dict of credsweeper configuration

Return type

List[DiffContentProvider]

Returns

file objects to analyse

load_patch_data()[source]

Return type

List[List[str]]

file_handler.text_content_provider module

class credsweeper.file_handler.text_content_provider.TextContentProvider(file_path, change_type=None, diff=None)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Provide access to analysis targets for full-text file scanning.

Parameters

file_path (str) – path to file

get_analysis_target()[source]

Load and preprocess file content to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets based on every row in file

file_handler.text_provider module

class credsweeper.file_handler.text_provider.TextProvider(paths, change_type=None, skip_ignored=None)[source]

Bases: credsweeper.file_handler.files_provider.FilesProvider

Provide full-text files for analysis.

Parameters
  • paths (List[str]) – list of parent paths of the files to scan

  • change_type (Optional[str]) – type of changes in the patch to analyse (added or deleted)

  • skip_ignored (Optional[bool]) – whether to check directories against the list of ignored directories from the gitignore file

get_files_sequence(file_paths)[source]

Return type

List[TextContentProvider]

get_scannable_files(config)[source]

Get the list of full-text file objects for analysis, for files whose parent paths are listed in “paths”.

Parameters

config (Dict) – dict of credsweeper configuration

Return type

List[TextContentProvider]

Returns

preprocessed file objects for analysis

Module contents

class credsweeper.file_handler.ByteContentProvider(content, file_path=None)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Allows scanning a byte sequence.

Parameters
  • content (bytes) – byte sequence to be scanned. It is automatically split into an array of lines if a newline character is present

  • file_path (Optional[str]) – may be specified if you know the true name of the file the lines were taken from

get_analysis_target()[source]

Return lines to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets based on every row in the content
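
The splitting of a byte sequence into rows can be sketched as below. bytes_to_lines is a hypothetical helper; decoding as UTF-8 and replacing undecodable bytes are choices made for this sketch, not documented behaviour of ByteContentProvider.

```python
from typing import List


def bytes_to_lines(content: bytes, encoding: str = "utf-8") -> List[str]:
    """Decode content and split it on line boundaries.

    Undecodable bytes are replaced rather than raising an error,
    which is a choice made for this sketch.
    """
    text = content.decode(encoding, errors="replace")
    # splitlines() handles "\n" and "\r\n" and drops the line terminators.
    return text.splitlines()


lines = bytes_to_lines(b"user=admin\npassword=hunter2\r\n")
single = bytes_to_lines(b"no newline here")
```

Content without any newline character yields a single row, so it can be handled the same way as multi-line content.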

class credsweeper.file_handler.ContentProvider(file_path, change_type=None, diff=None)[source]

Bases: abc.ABC

Base class that provides access to analysis targets for a scanned object.

abstract get_analysis_target()[source]

Load and preprocess file diff data to scan.

Return type

List[AnalysisTarget]

Returns

row objects to analyse

class credsweeper.file_handler.DiffContentProvider(file_path, change_type, diff)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Provide data from a single .patch file.

Parameters
  • file_path (str) – path to file

  • change_type (str) – which file data to scan: added or deleted

  • diff (List[Dict]) –

    list of file row changes, with base elements represented as:

    {
        "old": line number before diff,
        "new": line number after diff,
        "line": line text,
        "hunk": diff hunk number
    }
    

get_analysis_target()[source]

Preprocess file diff data to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets of every row of file diff corresponding to change type “self.change_type”

parse_lines_data(lines_data)[source]

Parse diff lines data.

Return a list of line numbers with change type “self.change_type” and a list of all lines in the file in original order (every line not mentioned in the diff file is replaced with a blank line).

Parameters

lines_data (List[DiffRowData]) – data of all rows mentioned in diff file

Return type

Tuple[List[int], List[str]]

Returns

tuple of line numbers with change type “self.change_type” and all file lines in original order (lines not mentioned in the diff file are replaced with blank lines)

class credsweeper.file_handler.StringContentProvider(lines, file_path=None)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Allows scanning an array of lines.

Parameters
  • lines (List[str]) – lines to be processed

  • file_path (Optional[str]) – may be specified if you know the true name of the file the lines were taken from

get_analysis_target()[source]

Return lines to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets based on every row in file

class credsweeper.file_handler.TextContentProvider(file_path, change_type=None, diff=None)[source]

Bases: credsweeper.file_handler.content_provider.ContentProvider

Provide access to analysis targets for full-text file scanning.

Parameters

file_path (str) – path to file

get_analysis_target()[source]

Load and preprocess file content to scan.

Return type

List[AnalysisTarget]

Returns

list of analysis targets based on every row in file