review_recommender package
Submodules
review_recommender.data_retriveal module
- class review_recommender.data_retriveal.RepoRetriveal(owner, repo, token=None)[source]
Bases: object
Class to download pull requests and commits from a GitHub repository.
- class Commit(sha: str, author_login: str, filesInfo: list, date: datetime)[source]
Bases: object
A commit in a repository, identified by its sha. Gives access to the author's login name and the commit date.
- class PullRequest(number: int, author_login: str, reviewers: set[str], date: datetime)[source]
Bases: object
A pull request in a repository, identified by its number. Gives access to the author's login name, the set of reviewers and the date.
- class RepoFile(filepath: str, content: str)[source]
Bases: object
A text file in a repository. Gives access to its path and its content.
- getCommitBySha(sha)[source]
Returns a RepoRetriveal.Commit instance from its sha.
- Parameters
sha (string) – The sha of the commit.
- Raises
requests.HTTPError – If there is no commit with this sha.
- getCommitFiles(commit: Commit)[source]
Returns a list of RepoRetriveal.RepoFile associated with a commit.
- Parameters
commit (RepoRetriveal.Commit) – the commit.
- Raises
requests.HTTPError – If there is no such commit.
- getCommitsIterable(toDate: datetime, numberOfCommits)[source]
Returns a generator, suitable for use in a for loop, that yields up to numberOfCommits commits made before toDate.
- Parameters
toDate (datetime.datetime) – retrieve only commits before this date.
numberOfCommits – retrieve this number of commits, if possible.
- getPullByNumber(number)[source]
Returns a RepoRetriveal.PullRequest instance from its number.
- Parameters
number (int) – The number of the pull request.
- Raises
requests.HTTPError – If there is no pull with this number.
- getPullFiles(pull: PullRequest)[source]
Returns a list of RepoRetriveal.RepoFile associated with a pull request.
- Parameters
pull (RepoRetriveal.PullRequest) – the pull request.
- Raises
requests.HTTPError – If there is no such pull.
- getPullIterable(toNumber, numberOfPulls)[source]
Returns a generator, suitable for use in a for loop, that yields up to numberOfPulls pull requests with numbers below toNumber.
- Parameters
toNumber (int) – retrieve only pulls before and not including this number.
numberOfPulls – retrieve this number of pulls, if possible.
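The iteration methods above can be combined as in the following minimal sketch, which tallies how often each login reviewed the pull requests preceding a given number. The helper name count_reviewers is hypothetical, and reading pull.reviewers and pull.author_login as attributes is an assumption based on the PullRequest constructor parameters, not something this page confirms.

```python
from collections import Counter


def count_reviewers(repo, to_number, number_of_pulls=30):
    """Count how often each login reviewed the pulls preceding `to_number`.

    `repo` is any object offering getPullIterable(toNumber, numberOfPulls)
    as documented above. Attribute access on the yielded pulls
    (`reviewers`, `author_login`) is an assumption.
    """
    counts = Counter()
    for pull in repo.getPullIterable(to_number, number_of_pulls):
        for login in pull.reviewers:
            # Skip self-reviews so authors do not inflate their own count.
            if login != pull.author_login:
                counts[login] += 1
    return counts
```

With the real class, repo would presumably be a RepoRetriveal(owner, repo, token) instance passed in unchanged. Calls may raise requests.HTTPError, as noted for the other methods.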
review_recommender.inverted_files module
- class review_recommender.inverted_files.InvertedFile[source]
Bases: object
A class that models a basic inverted file, to which you can add, for each item, a mapping from tokens to their counts. An item can be a document or anything else, as long as it is hashable. The inverted file can be queried for similar items based on cosine similarity.
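Since the methods of InvertedFile are not listed here, the following is only a sketch of the general technique: an inverted index from tokens to per-item counts, queried by cosine similarity. The class TinyInvertedFile and its methods add and similar are hypothetical stand-ins, not the package's API.

```python
import math
from collections import defaultdict


class TinyInvertedFile:
    """Hypothetical stand-in illustrating an inverted file with cosine ranking."""

    def __init__(self):
        self._postings = defaultdict(dict)   # token -> {item: count}
        self._norms_sq = defaultdict(float)  # item -> sum of squared counts

    def add(self, item, token_freqs):
        """Register token counts for a hashable item."""
        for token, count in token_freqs.items():
            if count == 0:
                continue  # zero counts contribute nothing to the vector
            old = self._postings[token].get(item, 0)
            self._postings[token][item] = count
            # Keep the squared norm consistent if a token is overwritten.
            self._norms_sq[item] += count * count - old * old

    def similar(self, query_freqs):
        """Return (item, cosine similarity) pairs, best match first."""
        q_norm = math.sqrt(sum(c * c for c in query_freqs.values()))
        if q_norm == 0:
            return []
        dots = defaultdict(float)
        # Only items sharing at least one token with the query are visited.
        for token, q_count in query_freqs.items():
            for item, count in self._postings.get(token, {}).items():
                dots[item] += q_count * count
        scores = {
            item: dot / (q_norm * math.sqrt(self._norms_sq[item]))
            for item, dot in dots.items()
        }
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

The point of the inverted layout is that a query only touches the items that share a token with it, rather than comparing against every stored item.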
review_recommender.ranker module
- review_recommender.ranker.getRanking(repo: RepoRetriveal, pullNumber, numberOfPulls=30, numberOfCommits=30)[source]
Given a repository and a pull request number, returns a ranking of possible reviewers.
- Parameters
repo (data_retriveal.RepoRetriveal) – the repository
pullNumber (int) – the number of the pull
numberOfPulls (int) – number of pulls you want to retrieve, defaults to 30
numberOfCommits (int) – number of commits you want to retrieve, defaults to 30
- Raises
requests.HTTPError – If the repository or the pull number is invalid.
review_recommender.scorer module
- class review_recommender.scorer.Scorer[source]
Bases: object
A simple scoreboard for reviewers.
- addReviewerScore(reviewer, score)[source]
Add a score to a reviewer.
- Parameters
reviewer (string) –
score (float) –
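A scoreboard of this kind could look like the sketch below. It assumes that repeated scores for the same reviewer are summed and that a ranking is read off by sorting. Neither behaviour is confirmed by this page, and TinyScorer, add_reviewer_score and ranking are hypothetical names.

```python
from collections import defaultdict


class TinyScorer:
    """Hypothetical scoreboard: accumulates a float score per reviewer."""

    def __init__(self):
        self._scores = defaultdict(float)

    def add_reviewer_score(self, reviewer, score):
        # Assumption: scores for the same reviewer add up.
        self._scores[reviewer] += score

    def ranking(self):
        # Highest score first; ties are broken alphabetically by login.
        return sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
```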
review_recommender.tokenizer module
- class review_recommender.tokenizer.Tokenizer[source]
Bases: object
A simple tokenizer for source files that collects import keywords from Python, Java and C++ code.
- static getTokenFreqs(file_list: list[RepoRetriveal.RepoFile])[source]
Given a list of files in a repository, returns the frequency of the import keywords in those files, provided they are source files in one of the supported languages. The tokens found in each file path are also added to the result.
- Parameters
file_list (list[data_retriveal.RepoRetriveal.RepoFile]) – the list of files
- Returns
A dictionary of token-frequency pairs.
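To make the description concrete, here is a rough sketch of import-based tokenization. The regular expressions, the file extensions, the function name token_freqs and the use of plain (filepath, content) tuples in place of RepoFile objects are all assumptions made for illustration. The package's actual rules may differ.

```python
import re
from collections import Counter

# Hypothetical patterns, one per supported language family.
_NAME = r"([\w.]*\w)"
IMPORT_PATTERNS = [
    ((".py",), re.compile(
        r"^\s*(?:from\s+" + _NAME + r"\s+import|import\s+" + _NAME + r")", re.M)),
    ((".java",), re.compile(r"^\s*import\s+(?:static\s+)?" + _NAME, re.M)),
    ((".cpp", ".cc", ".h", ".hpp"), re.compile(
        r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.M)),
]


def token_freqs(files):
    """Return {token: frequency} for a list of (filepath, content) tuples."""
    freqs = Counter()
    for filepath, content in files:
        # Path components are counted as tokens as well.
        freqs.update(part for part in re.split(r"[/\\.]", filepath) if part)
        for extensions, pattern in IMPORT_PATTERNS:
            if filepath.endswith(extensions):
                for match in pattern.finditer(content):
                    # Exactly one capture group is non-empty per match.
                    name = next(g for g in match.groups() if g)
                    freqs[name] += 1
    return dict(freqs)
```

This simplified version only records the first module of a comma-separated Python import and ignores relative imports. Files with other extensions contribute their path tokens only.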