Publications
"Resource Efficient Computing for Warehouse-scale Datacenters",
Conference on Design Automation and Test in Europe (DATE), Grenoble, France, 03/2013.
Download: paper (332.62 KB)
"Overcoming the limitations of conventional vector processors",
Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA), San Diego, CA, pp. 399–409, 06/2003.
"Scalable Vector Processors for Embedded Systems",
IEEE Micro, vol. 23, no. 6, pp. 36–45, 11/2003.
"Server Engineering Insights for Large-Scale Online Services",
IEEE Micro, vol. 30, no. 4, Los Alamitos, CA, USA, IEEE Computer Society Press, pp. 8–19, 2010.
Download: paper (496.53 KB)
"Vector vs. Superscalar and VLIW Architectures for Embedded Multimedia Benchmarks",
Proceedings of the 35th Annual ACM/IEEE International Symposium on Microarchitecture (MICRO), Istanbul, Turkey, pp. 283–293, 11/2002.
"The Stream Virtual Machine",
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 267–277, 9/2004.
"Reconciling High Server Utilization and Sub-millisecond Quality-of-Service",
Proceedings of the 2014 EuroSys Conference, Amsterdam, Netherlands, 04/2014.
"Dynamic Management of TurboMode in Modern Multi-core Chips",
20th Intl. Symposium on High Performance Computer Architecture (HPCA), Orlando, FL, 02/2014.
Download: paper (738.87 KB); slides (1.71 MB)
"Towards Energy Proportionality for Large-Scale Latency-Critical Workloads",
International Symposium on Computer Architecture, Minneapolis, Minnesota, 06/2014.
Download: Paper (897.16 KB); Slides (6.82 MB)
"Heracles: Improving Resource Efficiency at Scale",
International Symposium on Computer Architecture, Portland, Oregon, 06/2015.
Download: Paper (792.18 KB)
"Towards Energy-proportional Datacenter Memory with Mobile DRAM",
Proceedings of the 39th Annual International Symposium on Computer Architecture, Washington, DC, USA, IEEE Computer Society, pp. 37–48, 2012.
Download: paper (5.08 MB)
"Evaluating Bufferless Flow Control for On-Chip Networks",
Proceedings of the 4th ACM/IEEE international symposium on Networks-on-Chip (NOCS-2010), 05/2010.
Download: paper (158.51 KB); slides (897.85 KB)
"ATLAS: a chip-multiprocessor with transactional memory support",
Proceedings of the conference on Design, automation and test in Europe, San Jose, CA, USA, EDA Consortium, pp. 3–8, 2007.
Download: atlas_date_07.pdf (736.86 KB)
"FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures",
FCCM: IEEE Computer Society, pp. 221–228, 2010.
Download: paper (1.05 MB)
"Generating Configurable Hardware from Parallel Patterns",
Twenty First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Atlanta, GA, 04/2016.
Download: paper (582.67 KB)
"Plasticine: A Reconfigurable Architecture For Parallel Patterns",
ISCA '17: 44th International Symposium on Computer Architecture, Toronto, Canada, 06/2017.
Download: paper (1.53 MB)
"Convolution Engine: Balancing Efficiency & Flexibility in Specialized Computing",
Proceedings of the 40th Annual International Symposium on Computer Architecture, New York, NY, USA, ACM, pp. 24–35, 2013.
Download: paper (4.78 MB)
"Scalable and Efficient Fine-Grained Cache Partitioning with Vantage",
IEEE Micro's Top Picks from the Computer Architecture Conferences, vol. 32, no. 3, May-June, 2012.
Download: paper (1.05 MB)
"ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Systems",
Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA-40), Tel-Aviv, Israel, 06/2013.
Download: paper (454.61 KB)
"Vantage: Scalable and Efficient Fine-Grain Cache Partitioning",
International Symposium on Computer Architecture (ISCA), San Jose, CA, 06/2011.
Download: paper (753.6 KB); slides (1.74 MB)
"Flexible Architectural Support for Fine-Grain Scheduling",
Proceedings of the 15th international conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XV), 03/2010.
Download: paper (433.69 KB); slides (354.52 KB)
"The ZCache: Decoupling Ways and Associativity",
Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'43), Atlanta, GA, 12/2010.
Download: paper (276.4 KB); slides (752.04 KB)
"An Analysis of Interconnection Networks for Large Scale Chip-Multiprocessors",
ACM Transactions on Architecture and Code Optimization (TACO), vol. 7, no. 1, 04/2010.
Download: paper (1.7 MB)
"SCD: A Scalable Coherence Directory with Flexible Sharer Set Encoding",
Proceedings of the 18th international symposium on High Performance Computer Architecture (HPCA-18), New Orleans, LA, 02/2012.
Download: paper (424.5 KB); slides (418.71 KB)
"Dynamic Fine-Grain Scheduling of Pipeline Parallelism",
Proceedings of the 20th Intl. Conference on Parallel Architecture and Compilation Techniques (PACT), Galveston Island, TX, 10/2011.
Download: PDF (336.92 KB)