Computer Science Faculty Research & Creative Works

Exploring Optimization and Caching for Efficient Collection Operations

Venkata Krishna Suhas Nerella
Swetha Surapaneni
Sanjay K. Madria, Missouri University of Science and TechnologyFollow
Thomas Weigert

Abstract

Many large programs operate on collection types. Extensive libraries are available in many programming languages, such as the C++ Standard Template Library, which make programming with collections convenient. Extending programming languages to provide collection queries as first-class constructs in the language would not only allow programmers to write queries explicitly in their programs but it would also allow compilers to leverage the wealth of experience available from the database domain to optimize such queries. This paper describes an approach to reduce the run time of programs involving explicit collection queries by performing run time query optimization that is effective for single runs of a program. In addition, it also leverages a cache to store previously computed results. The proposed approach relies on histograms built from the data at run time to estimate the selectivity of joins and predicates in order to construct query plans. Information from earlier executions of the same query during run time is leveraged during the construction of the query plans, even when the data has changed between these executions. An effective cache policy is also determined for caching the results of join (sub) queries. The cache is maintained incrementally, when the underlying collections change, and use of the cache space is optimized by a cache replacement policy. Our approach has been implemented within the Java Query Language (JQL) framework using Aspect J. Our approach demonstrated that its run time query optimization in integration with caching sub query result significantly improves the run time of programs with explicit queries over equivalent programs performing collection operations by iterating over those collections. This paper evaluates our approach using synthetic as well as real world Robocode programs by comparing it to JQL as a benchmark. Experimental results show that our approach performs better than the JQL approach with respect to the program run time. © Springer Science + Business Media New York 2013.

Recommended Citation

V. K. Nerella et al., "Exploring Optimization and Caching for Efficient Collection Operations," Automated Software Engineering, vol. 21, no. 1, pp. 3 - 40, Springer, Jan 2014.

The definitive version is available at https://doi.org/10.1007/s10515-013-0119-x

Department(s)

Computer Science

Keywords and Phrases

Cache policy; Collection operations; Join caching; Joins; Query optimization; Run time; Selectivity

International Standard Serial Number (ISSN)

1573-7535; 0928-8910

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Jan 2014

Download

Full Text Link

Included in

Computer Sciences Commons

COinS

Computer Science Faculty Research & Creative Works

Exploring Optimization and Caching for Efficient Collection Operations

Abstract

Recommended Citation

Department(s)

Keywords and Phrases

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Included in

Search

Browse

Author Corner

Related Content

Useful Links

Article Locations

Computer Science Faculty Research & Creative Works

Exploring Optimization and Caching for Efficient Collection Operations

Author

Abstract

Recommended Citation

Department(s)

Keywords and Phrases

International Standard Serial Number (ISSN)

Document Type

Document Version

File Type

Language(s)

Rights

Publication Date

Included in

Share

Search

Browse

Author Corner

Related Content

Useful Links

Article Locations