pureseqtmr
R package to work with PureseqTM.
Because PureseqTM only works on Linux, pureseqtmr only works on Linux as well.
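A script that may run on several platforms could check the operating system before calling pureseqtmr. The following is a minimal sketch in base R only; the helper name `is_on_linux` is hypothetical and is not part of pureseqtmr.

```r
# Hypothetical helper (not part of pureseqtmr):
# TRUE if the current session runs on Linux, FALSE otherwise
is_on_linux <- function() {
  identical(Sys.info()[["sysname"]], "Linux")
}

if (is_on_linux()) {
  message("Linux detected: pureseqtmr can be used")
} else {
  message("Not on Linux: skipping PureseqTM predictions")
}
```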
Example
library(pureseqtmr)
# Use an example proteome
fasta_filename <- system.file(
  "extdata",
  "UP000464024.fasta",
  package = "pureseqtmr"
)
# Predict the topology
topology <- predict_topology(fasta_filename)
# Simplify the protein names
topology$name <- stringr::str_match(
  string = topology$name,
  pattern = "..\\|.*\\|(.*)_SARS2"
)[, 2]
# Plot the topology
plot_topology(topology)
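To see what the name-simplifying pattern in the example does, here is a small sketch in base R that needs neither pureseqtmr nor PureseqTM. The header below is an assumption: an illustrative name in the UniProt 'database|accession|entry name' style, not output taken from `predict_topology`.

```r
# Assumed example name, in the UniProt 'db|accession|entry name' style
name <- "sp|P0DTC2|SPIKE_SARS2 Spike glycoprotein"

# Same pattern as in the example above
pattern <- "..\\|.*\\|(.*)_SARS2"

# regexec gives the full match first, then the capture group,
# so element 2 is the text between the second '|' and '_SARS2'
short_name <- regmatches(name, regexec(pattern, name))[[1]][2]
print(short_name)
```

For this assumed header, the capture group holds the entry name without its `_SARS2` suffix, which is what ends up as the label in the plot.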
Install PureseqTM
Due to CRAN policy, pureseqtmr cannot install PureseqTM.
As a workaround, the non-CRAN package pureseqtmrinstall can be used.
To install PureseqTM:
remotes::install_github("richelbilderbeek/pureseqtmrinstall")
pureseqtmrinstall::install_pureseqtm()
How to convert a FASTA file to a PureseqTM topology file?
Combine predict_topology and save_tibble_as_fasta_file:
library(pureseqtmr)
save_tibble_as_fasta_file(
  predict_topology(fasta_filename = "my.fasta"),
  fasta_filename = "my.topo"
)
Are there similar packages?
These are the ones I use:
- tmhmm: predicts membrane protein topology at three levels (inside, outside, membrane). It requires filling in a registration form, as it is for academic users only. In a run-time speed comparison, PureseqTM is approximately 20x slower than TMHMM.
What happens to a selenocysteine (U) in the sequence?
pureseqtmr gives an error.
PureseqTM ignores the selenocysteine, as it does with any character.
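Because pureseqtmr gives an error on a selenocysteine, it can help to detect such sequences before predicting. The following is a minimal sketch in base R; the helper `has_selenocysteine` is hypothetical and not part of pureseqtmr, and the sequences are made-up examples.

```r
# Hypothetical helper (not part of pureseqtmr):
# for each sequence, TRUE if it contains a selenocysteine (U)
has_selenocysteine <- function(sequences) {
  grepl("U", toupper(sequences), fixed = TRUE)
}

# Made-up example sequences: only the second contains a U
sequences <- c("MKTAYIAKQR", "MKUAYIAKQR")
print(has_selenocysteine(sequences))
```

Sequences flagged this way can then be removed or handled separately before calling `predict_topology`.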
What happens to short sequences?
The pureseqtmr package gives an error.
PureseqTM prints out a message and returns a zero-length topology.
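To avoid the error on short sequences, these could be filtered out beforehand. The following is a minimal sketch in base R; the helper `keep_long_sequences` is hypothetical, and `min_length` is an assumption, as the minimum length that PureseqTM accepts is not stated here. The value of 10 below is only illustrative.

```r
# Hypothetical helper (not part of pureseqtmr):
# keep only the sequences with at least `min_length` residues.
# `min_length` is an assumption: the actual threshold is not stated here
keep_long_sequences <- function(sequences, min_length) {
  sequences[nchar(sequences) >= min_length]
}

# Made-up example sequences
sequences <- c(short = "MKT", long = "MKTAYIAKQRQISFVKSHFSRQ")
long_sequences <- keep_long_sequences(sequences, min_length = 10)
print(names(long_sequences))
```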
There is a feature I miss
See CONTRIBUTING, at 'Submitting use cases'
I want to collaborate
See CONTRIBUTING, at 'Submitting code'
I think I have found a bug
See CONTRIBUTING, at 'Submitting bugs'
There's something else I want to say
Sure, just add an Issue. Or send an email.
References
Article about PureseqTM:
- Wang, Qing, et al. "Efficient and accurate prediction of transmembrane topology from amino acid sequence only." bioRxiv (2019): 627307.