review_recommender package

Submodules

review_recommender.data_retriveal module

class review_recommender.data_retriveal.RepoRetriveal(owner, repo, token=None)[source]

Bases: object

Class to download pull requests and commits from a GitHub repository.

class Commit(sha: str, author_login: str, filesInfo: list, date: datetime)[source]

Bases: object

A commit in a repository, identified by its sha. Gives access to the author's login name and the commit date.

class PullRequest(number: int, author_login: str, reviewers: set[str], date: datetime)[source]

Bases: object

A pull request in a repository, identified by its number. Gives access to the author's login name, the set of reviewers and the date.

class RepoFile(filepath: str, content: str)[source]

Bases: object

A text file in a repository. Can access its path and its content.

getCommitBySha(sha)[source]

Returns a RepoRetriveal.Commit instance from its sha.

Parameters

sha (string) – The sha of the commit.

Raises

requests.HTTPError – If there is no commit with this sha.

getCommitFiles(commit: Commit)[source]

Returns a list of RepoRetriveal.RepoFile associated with a commit.

Parameters

commit (RepoRetriveal.Commit) – the commit.

Raises

requests.HTTPError – If there is no such commit.

getCommitsIterable(toDate: datetime, numberOfCommits)[source]

Returns a generator, suitable for use in a for loop, that yields up to numberOfCommits commits made before toDate.

Parameters
  • toDate (datetime.datetime) – retrieve only commits before this date.

  • numberOfCommits – retrieve this number of commits, if possible.

getPullByNumber(number)[source]

Returns a RepoRetriveal.PullRequest instance from its number.

Parameters

number (int) – The number of the pull request.

Raises

requests.HTTPError – If there is no pull with this number.

getPullFiles(pull: PullRequest)[source]

Returns a list of RepoRetriveal.RepoFile associated with a pull request.

Parameters

pull (RepoRetriveal.PullRequest) – the pull request.

Raises

requests.HTTPError – If there is no such pull.

getPullIterable(toNumber, numberOfPulls)[source]

Returns a generator, suitable for use in a for loop, that yields up to numberOfPulls pull requests whose numbers are lower than toNumber.

Parameters
  • toNumber (int) – retrieve only pulls before and not including this number.

  • numberOfPulls – retrieve this number of pulls, if possible.
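
The iterable methods above are documented only by their contract: a generator that yields at most a given number of items below a bound. The following is a minimal sketch of that pattern, not the package's implementation. The names `iter_pulls` and `fetch_pull` are hypothetical, `LookupError` stands in for `requests.HTTPError`, and the newest-first order is an assumption.

```python
def iter_pulls(fetch_pull, to_number, number_of_pulls):
    """Yield up to number_of_pulls pulls with number < to_number, newest first.

    fetch_pull is a hypothetical callable that returns a pull for a given
    number, or raises LookupError when no pull has that number.
    """
    yielded = 0
    number = to_number - 1  # the bound itself is excluded
    while number >= 1 and yielded < number_of_pulls:
        try:
            pull = fetch_pull(number)
        except LookupError:
            pass  # gap in the numbering: skip it and keep going
        else:
            yielded += 1
            yield pull
        number -= 1


# Usage with an in-memory stand-in for the remote repository.
existing = {1: "pull-1", 2: "pull-2", 4: "pull-4", 5: "pull-5"}
pulls = list(iter_pulls(lambda n: existing[n], to_number=5, number_of_pulls=3))
```

Because number 3 is missing in this stand-in data, the sketch skips it and still delivers three pulls, which is one reading of "if possible" in the parameter description.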

review_recommender.inverted_files module

class review_recommender.inverted_files.InvertedFile[source]

Bases: object

A class that models a basic inverted file. Each item added to it is associated with a dictionary mapping tokens to their counts; an item can be a document or anything else, as long as it is hashable. The inverted file can be queried for similar items based on cosine similarity.

add(item, token_freq: dict[str, int])[source]

Adds an item with its associated token frequencies.

Parameters

token_freq (dict[str, int]) – pairs of tokens and their counts

getSimilar(tokenFreqs)[source]

Returns a list of items similar to a query, where the query is a dictionary mapping tokens to their counts.

Parameters

tokenFreqs (dict[str, int]) – pairs of tokens and their counts
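
To illustrate how an inverted file can answer cosine-similarity queries, here is a minimal, self-contained sketch. It is not the package's code: the class `SketchInvertedFile`, its method names and its return format (pairs of item and similarity) are assumptions made for the example, whereas the documented `getSimilar` returns a list of items.

```python
import math
from collections import defaultdict


class SketchInvertedFile:
    """Illustrative inverted file: maps each token to {item: count} postings."""

    def __init__(self):
        self._postings = defaultdict(dict)  # token -> {item: count}
        self._norms = {}                    # item -> Euclidean norm of its counts

    def add(self, item, token_freq):
        for token, count in token_freq.items():
            self._postings[token][item] = count
        self._norms[item] = math.sqrt(sum(c * c for c in token_freq.values()))

    def get_similar(self, token_freqs):
        """Return (item, cosine similarity) pairs, most similar first."""
        query_norm = math.sqrt(sum(c * c for c in token_freqs.values()))
        if query_norm == 0:
            return []
        dots = defaultdict(float)
        for token, q_count in token_freqs.items():
            # Only items sharing at least one token with the query are visited.
            for item, count in self._postings.get(token, {}).items():
                dots[item] += q_count * count
        scored = [
            (item, dot / (self._norms[item] * query_norm))
            for item, dot in dots.items()
            if self._norms[item] > 0
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Usage: index two items, then query with a new token-count dictionary.
index = SketchInvertedFile()
index.add("pull-1", {"numpy": 2, "os": 1})
index.add("pull-2", {"requests": 3})
results = index.get_similar({"numpy": 1})
```

The point of the token-to-items layout is that a query only touches items that share a token with it, so `pull-2` is never scored in the usage above.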

review_recommender.ranker module

review_recommender.ranker.getRanking(repo: RepoRetriveal, pullNumber, numberOfPulls=30, numberOfCommits=30)[source]

Given a repository and a pull number, returns a ranking of possible reviewers.

Parameters
  • repo (data_retriveal.RepoRetriveal) – the repository

  • pullNumber (int) – the number of the pull

  • numberOfPulls (int) – number of pulls you want to retrieve, defaults to 30

  • numberOfCommits (int) – number of commits you want to retrieve, defaults to 30

Raises

requests.HTTPError – If the repository or the pull number is invalid.

review_recommender.scorer module

class review_recommender.scorer.Scorer[source]

Bases: object

A simple scoreboard for reviewers.

addReviewerScore(reviewer, score)[source]

Add a score to a reviewer.

Parameters
  • reviewer (string) –

  • score (float) –

getSorted()[source]

Returns a dictionary sorted by decreasing score.

Returns

reviewer-score pairs.

Return type

dict[string, float]

prettyFormat()[source]

Returns a formatted string of the scores as percentages.
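
A scoreboard with this interface could be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the class `SketchScorer` is hypothetical, repeated scores for the same reviewer are assumed to accumulate, and percentages are assumed to be each reviewer's share of the total score.

```python
class SketchScorer:
    """Illustrative reviewer scoreboard."""

    def __init__(self):
        self._scores = {}

    def add_reviewer_score(self, reviewer, score):
        # Assumption: scores for the same reviewer add up.
        self._scores[reviewer] = self._scores.get(reviewer, 0.0) + score

    def get_sorted(self):
        # dicts keep insertion order, so inserting in sorted order is enough.
        ordered = sorted(self._scores.items(), key=lambda kv: kv[1], reverse=True)
        return dict(ordered)

    def pretty_format(self):
        # Assumption: a percentage is the reviewer's share of the total score.
        total = sum(self._scores.values())
        lines = []
        for reviewer, score in self.get_sorted().items():
            share = 100.0 * score / total if total else 0.0
            lines.append(f"{reviewer}: {share:.1f}%")
        return "\n".join(lines)


# Usage
board = SketchScorer()
board.add_reviewer_score("alice", 1.0)
board.add_reviewer_score("bob", 3.0)
board.add_reviewer_score("alice", 0.5)
report = board.pretty_format()
```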

review_recommender.tokenizer module

class review_recommender.tokenizer.Tokenizer[source]

Bases: object

A simple tokenizer for source files that collects import keywords from Python, Java and C++ code.

static getTokenFreqs(file_list: list[RepoRetriveal.RepoFile])[source]

Given a list of files in a repository, returns the frequency of the import keywords found in those files that are source files of a supported language. The tokens in each file path are also added to the result.

Parameters

file_list (list[data_retriveal.RepoRetriveal.RepoFile]) – the list of files

Returns

A dictionary of token-frequency pairs.
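
The kind of extraction described above can be sketched with regular expressions. This is a rough illustration, not the package's tokenizer: the function `token_freqs`, the patterns, the extension-based language detection and the splitting of paths on separators are all assumptions, and the input is a list of `(filepath, content)` pairs rather than `RepoFile` objects.

```python
import re
from collections import Counter

# Illustrative patterns only; real import syntax has more cases than these cover.
IMPORT_PATTERNS = {
    ".py": re.compile(r"^\s*(?:from|import)\s+([\w.]+)", re.MULTILINE),
    ".java": re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+)", re.MULTILINE),
    ".cpp": re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE),
}


def token_freqs(files):
    """Count import names and path components over (filepath, content) pairs."""
    counts = Counter()
    for filepath, content in files:
        # Path components count as tokens too.
        counts.update(part for part in re.split(r"[/\\]", filepath) if part)
        # Files with an unsupported extension contribute only their path tokens.
        for extension, pattern in IMPORT_PATTERNS.items():
            if filepath.endswith(extension):
                counts.update(pattern.findall(content))
    return counts


# Usage
files = [
    ("src/app.py", "import os\nfrom collections import Counter\n"),
    ("src/Main.java", "import java.util.List;\n"),
]
freqs = token_freqs(files)
```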