SantaCoder: don't reach for the stars!
Authors:
Loubna Ben Allal,
Raymond Li,
Denis Kocetkov,
Chenghao Mou,
Christopher Akiki,
Carlos Munoz Ferrandis,
Niklas Muennighoff,
Mayank Mishra,
Alex Gu,
Manan Dey,
Logesh Kumar Umapathi,
Carolyn Jane Anderson,
Yangtian Zi,
Joel Lamy Poirier,
Hailey Schoelkopf,
Sergey Troshin,
Dmitry Abulkhanov,
Manuel Romero,
Michael Lappert,
Francesco De Toni,
Bernardo García del Río,
Qian Liu,
Shamik Bose,
Urvashi Bhattacharyya,
Terry Yue Zhuo, et al. (16 additional authors not shown)
Abstract:
The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigating better preprocessing methods for the training data. We train 1.1B parameter models on the Java, JavaScript, and Python subsets of The Stack and evaluate them on the MultiPL-E text-to-code benchmark. We find that more aggressive filtering of near-duplicates can further boost performance and, surprisingly, that selecting files from repositories with 5+ GitHub stars deteriorates performance significantly. Our best model outperforms previous open-source multilingual code generation models (InCoder-6.7B and CodeGen-Multi-2.7B) in both left-to-right generation and infilling on the Java, JavaScript, and Python portions of MultiPL-E, despite being a substantially smaller model. All models are released under an OpenRAIL license at https://hf.co/bigcode.
Submitted 24 February, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
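The near-duplicate filtering mentioned in the SantaCoder abstract above can be illustrated with a minimal sketch. The abstract does not describe the filtering method, so everything here is an assumption made for illustration: the function names (`shingles`, `jaccard`, `filter_near_duplicates`), the whitespace tokenization, the shingle size `k=3`, and the threshold `0.7` are hypothetical, and the sketch uses exact pairwise Jaccard similarity, which is simple but does not scale to a large corpus.

```python
def shingles(text, k=3):
    """Return the set of k-token shingles of a file (whitespace tokens)."""
    tokens = text.split()
    if len(tokens) < k:
        # Short files get a single shingle holding all their tokens.
        return {tuple(tokens)}
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_near_duplicates(files, threshold=0.7, k=3):
    """Greedily keep a file only if it is not too similar to any kept file.

    A lower threshold removes more files, i.e. filters more aggressively.
    """
    kept = []
    kept_shingles = []
    for text in files:
        s = shingles(text, k)
        if all(jaccard(s, other) < threshold for other in kept_shingles):
            kept.append(text)
            kept_shingles.append(s)
    return kept


files = [
    "def add(a, b): return a + b",
    "def add(a, b): return a + b  # sum",
    "print('hello world')",
]
unique_files = filter_near_duplicates(files)
```

In this toy input the second file shares 5 of 7 distinct shingles with the first, so it is dropped at the assumed threshold while the unrelated third file is kept.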
Performance Evaluation of Delay Tolerant Network in Heterogeneous Highly Dense Mobile Environment
Authors:
R. S. Mangrulkar,
Dr. Mohammad Atique
Abstract:
A delay tolerant network (DTN) is an opportunistic network in which each node looks for the best opportunity to deliver a message, called a bundle, to its destination. DTN implements a store-and-forward message switching system by introducing a new protocol layer, the Bundle Layer, on top of the transport layer. The bundle layer is responsible for storing and forwarding entire messages, in message segments called bundles, between the source node and the destination node. This paper evaluates the performance of the delay tolerant network layer in a heterogeneous, highly dense mobile node environment. The heterogeneous network is built from a stationary wired node and a Base Station node, to which a dynamic, dense network of mobile nodes is added. Mobile nodes are assigned continuous mobility. Three parameters, $Δ$, $Θ$ and $λ$, are suggested to correlate the results obtained through rigorous simulation. Results show that beyond certain threshold values, the density of mobile nodes does not appear to be a cause of delay for delay tolerant network packets. Also, increasing the number of mobile nodes and the number of File Transfer connections rarely changes the overall performance of the delay tolerant network.
Submitted 24 February, 2013;
originally announced February 2013.
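The store-and-forward behaviour of the bundle layer described in the DTN abstract above can be sketched as follows. This is a toy model under stated assumptions, not the paper's simulation setup: the `Bundle` and `Node` classes, the `store` and `contact` methods, and the policy of handing every buffered bundle to whichever peer is met are all hypothetical names and choices introduced for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Bundle:
    source: str
    destination: str
    payload: bytes


@dataclass
class Node:
    name: str
    buffer: deque = field(default_factory=deque)   # bundles waiting for a contact
    delivered: list = field(default_factory=list)  # bundles addressed to this node

    def store(self, bundle):
        """Accept a bundle: deliver it locally or hold it for later forwarding."""
        if bundle.destination == self.name:
            self.delivered.append(bundle)
        else:
            self.buffer.append(bundle)

    def contact(self, peer):
        """Forward every buffered bundle to a peer during a contact opportunity.

        Assumed policy: hand over everything; real routing schemes are
        more selective. Returns the number of bundles forwarded.
        """
        count = len(self.buffer)
        while self.buffer:
            peer.store(self.buffer.popleft())
        return count


# A bundle travels a -> b -> c across two separate contact opportunities.
a, b, c = Node("a"), Node("b"), Node("c")
a.store(Bundle(source="a", destination="c", payload=b"hi"))
forwarded_first = a.contact(b)   # b stores the bundle until it meets c
forwarded_second = b.contact(c)  # c is the destination, so it is delivered
```

The point of the sketch is that no end-to-end path between `a` and `c` ever has to exist at one time: the intermediate node holds the bundle in its buffer until the next contact occurs.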