Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu.
Published:
This blog post is a collaboration and is hosted on the AWS Machine Learning Blog.
Published:
In this research project, we show that a deep-learning-based DNA language model can achieve sequence classification performance approaching that of standard mapping algorithms. The model's performance was closest to that of mapping when the input sequences were mutated at a higher rate (0.1 SNPs per bp).
Published in medRxiv, 2021
The main contributions of this paper include a practical one-stage group testing protocol guided by maximizing pool entropy and a maximum-likelihood recovery algorithm under the probabilistic framework.
Recommended citation: Liu, Y., Kadyan, S., Pe’er, I. (2021). A Recovery Algorithm and Pooling Designs for One-Stage Noisy Group Testing Under the Probabilistic Framework. medRxiv 2021.03.09.21253193; doi: https://doi.org/10.1101/2021.03.09.21253193 https://www.medrxiv.org/content/10.1101/2021.03.09.21253193v1
Published in International Conference on Algorithms for Computational Biology, 2021
The main contributions of this paper include a practical one-stage group testing protocol guided by maximizing pool entropy and a maximum-likelihood recovery algorithm under the probabilistic framework.
Recommended citation: Liu, Y., Kadyan, S., Pe’er, I. (2021). A Recovery Algorithm and Pooling Designs for One-Stage Noisy Group Testing Under the Probabilistic Framework. In: Martín-Vide, C., Vega-Rodríguez, M.A., Wheeler, T. (eds) Algorithms for Computational Biology. AlCoB 2021. Lecture Notes in Computer Science, vol 12715. Springer, Cham. https://link.springer.com/chapter/10.1007/978-3-030-74432-8_4
Published in bioRxiv, 2022
Here we report OpenFold, a fast, memory-efficient, and trainable implementation of AlphaFold2, and OpenProteinSet, the largest public database of protein multiple sequence alignments. We use OpenProteinSet to train OpenFold from scratch, fully matching the accuracy of AlphaFold2. Having established parity, we assess OpenFold's capacity to generalize across fold space by retraining it using carefully designed datasets.
Recommended citation: OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Gustaf Ahdritz, Nazim Bouatta, Sachin Kadyan, Qinghui Xia, William Gerecke, Timothy J O’Donnell, Daniel Berenberg, Ian Fisk, Niccolò Zanichelli, Bo Zhang, Arkadiusz Nowaczynski, Bei Wang, Marta M Stepniewska-Dziubinska, Shang Zhang, Adegoke Ojewole, Murat Efe Guney, Stella Biderman, Andrew M Watkins, Stephen Ra, Pablo Ribalta Lorenzo, Lucas Nivon, Brian Weitzner, Yih-En Andrew Ban, Peter K Sorger, Emad Mostaque, Zhao Zhang, Richard Bonneau, Mohammed AlQuraishi; bioRxiv 2022.11.20.517210; doi: https://doi.org/10.1101/2022.11.20.517210 https://www.biorxiv.org/content/10.1101/2022.11.20.517210v2
Published:
As part of the Machine Learning on AWS for Life Sciences talk at AWS re:Invent 2022, I talked about the growing importance of computational power and techniques in the advancement of biology.
Graduate Course, Department of Computer Science, Columbia University, 2021
I was a Teaching Assistant and Instructor for the Natural Language Processing course taught by Prof. Yassine Benajiba during the Fall 2021 semester.