Rapid lossless compression of short text messages

Kenan Kalajdzic, Samaher Hussein Ali, Ahmed Patel

    Research output: Contribution to journal › Article

    13 Citations (Scopus)

    Abstract

    In this paper we present a new algorithm called b64pack for compression of very short text messages. The algorithm executes in two phases. In the first phase, it converts the input text, consisting of letters, numbers, spaces and punctuation marks commonly used in English writing, to a format which can be compressed in the second phase. The second phase consists of a transformation which reduces the size of the message by a fixed fraction of its original size. We experimentally measured both the compression speed and the compression ratio of b64pack on a large number of short messages and compared them with compress, gzip and bzip2, the three most common UNIX compression programs. We show that for short text messages up to a certain size, b64pack achieves better compression than any of the three programs. With respect to speed, b64pack beats all three programs by orders of magnitude. This rapid compression is one of the key strengths of b64pack.
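The abstract does not spell out the two phases, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes, going by the name b64pack, that phase one maps each character to a 6-bit code and phase two packs four such codes into three bytes, which removes a fixed 25% of the size. The 63-symbol `ALPHABET`, the reserved `PAD` code, and the function names `to_codes`, `pack` and `unpack` are all hypothetical choices made for illustration.

```python
import bz2
import gzip
import string

# Hypothetical alphabet (an assumption, not taken from the paper):
# 63 symbols, so that 6-bit code 63 can be reserved as padding.
ALPHABET = string.ascii_lowercase + string.digits + " .,?!'\"-:;()@/&+#*=%$_<>[]\n"
PAD = 63
assert len(ALPHABET) == 63 and len(set(ALPHABET)) == 63
CODE = {ch: i for i, ch in enumerate(ALPHABET)}


def to_codes(text: str) -> list[int]:
    """Phase 1 (assumed): convert text to 6-bit codes, padded to a multiple of 4."""
    try:
        codes = [CODE[ch] for ch in text]
    except KeyError as exc:
        raise ValueError(f"unsupported character: {exc.args[0]!r}") from None
    return codes + [PAD] * (-len(codes) % 4)


def pack(text: str) -> bytes:
    """Phase 2 (assumed): pack every four 6-bit codes into three bytes."""
    codes = to_codes(text)
    out = bytearray()
    for i in range(0, len(codes), 4):
        a, b, c, d = codes[i:i + 4]
        word = (a << 18) | (b << 12) | (c << 6) | d
        out += word.to_bytes(3, "big")
    return bytes(out)


def unpack(data: bytes) -> str:
    """Reverse both phases: split each 3-byte word into four codes, drop padding."""
    codes = []
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], "big")
        codes += [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    while codes and codes[-1] == PAD:
        codes.pop()
    return "".join(ALPHABET[c] for c in codes)


message = "see you at 8, ok?"
raw = message.encode("ascii")
print("original:", len(raw), "bytes")
print("packed:  ", len(pack(message)), "bytes")
print("gzip:    ", len(gzip.compress(raw)), "bytes")
print("bz2:     ", len(bz2.compress(raw)), "bytes")
```

General-purpose compressors add headers and need some input before their models pay off, which is why a fixed-ratio packing like this can come out ahead on very short messages. The sketch rejects characters outside its alphabet (uppercase letters, for instance); how the actual first phase handles them is not stated in the abstract.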

    Original language: English
    Pages (from-to): 53-59
    Number of pages: 7
    Journal: Computer Standards and Interfaces
    Volume: 37
    DOI: 10.1016/j.csi.2014.05.005
    Publication status: Published - 1 Jan 2015

    Keywords

    • Data compression
    • Lossless compression
    • Short text messages
    • SMS

    ASJC Scopus subject areas

    • Software
    • Hardware and Architecture
    • Law

    Cite this

    Rapid lossless compression of short text messages. / Kalajdzic, Kenan; Ali, Samaher Hussein; Patel, Ahmed.

    In: Computer Standards and Interfaces, Vol. 37, 01.01.2015, p. 53-59.

    @article{e95ba16f99d84c1da96b1f3a470b56da,
    title = "Rapid lossless compression of short text messages",
    keywords = "Data compression, Lossless compression, Short text messages, SMS",
    author = "Kalajdzic, Kenan and Ali, {Samaher Hussein} and Patel, Ahmed",
    year = "2015",
    month = "1",
    day = "1",
    doi = "10.1016/j.csi.2014.05.005",
    language = "English",
    volume = "37",
    pages = "53--59",
    journal = "Computer Standards and Interfaces",
    issn = "0920-5489",
    publisher = "Elsevier",
    }
