Garage Cinema Research
UNIVERSITY OF CALIFORNIA AT BERKELEY SCHOOL OF INFORMATION
RESEARCH
 Garage Cinema Research Overview

Garage Cinema Research is a research group led by Professor Marc Davis at UC Berkeley's School of Information Management and Systems (SIMS) that is focused on creating the technology and applications that will enable daily media consumers to become daily media producers. Garage Cinema Research's work encompasses the theory, design, and development of digital media systems for creating and using media metadata to automate media production, sharing, and reuse. Prof. Davis and his students in Garage Cinema Research are working on: Mobile Media Metadata (context-aware mobile media technology and applications that leverage spatial, temporal, and social contextual metadata to infer media content and support media sharing and reuse); the Social Uses of Personal Media (social science and design research to learn how and why people use digital imaging in order to support the design of next-generation mobile media applications); Media Streams Metadata Exchange (media metadata framework for annotating, retrieving, sharing, and remixing media on the Web); Active Capture (interactive cameras that use signal processing and computer-human interaction to capture high-quality, reusable, annotated media assets); and Adaptive Media (adaptive media templates and automatic editing functions to mass-customize and personalize media). Working together, these research projects and their related technologies will radically simplify, decentralize, and personalize media production, sharing, and reuse, bringing about a "Garage Cinema" revolution in which people use computational media to communicate with each other every day.

 Garage Cinema Research Projects

Mobile Media Metadata (MMM)

Mobile phones with media creation capabilities are rapidly entering the marketplace in the USA and already have significant market presence in Asia and Europe. In 2003, more cameraphones than digital cameras were sold worldwide. Cameraphones will bring about a revolution in consumer imaging because they are not only networked, but programmable. Software developers can write applications for these mobile imaging devices, and software can fundamentally change the imaging experience at the point of image capture. The opportunity is to create software solutions for cameraphones that can address long-standing challenges in consumer media creation, sharing, management, and reuse in a fundamentally new way. We can do this by leveraging the spatio-temporal context and social community of media capture and use (when, where, and by whom media is captured, shared, and used) to infer media content, context, and community and thereby help automate media annotation, retrieval, sharing, and reuse. As a result of this approach, we believe we will solve a fundamental problem in consumer adoption of mobile media services: the need to have content-based access to the media consumers capture on their mobile devices.

We have conducted fairly large-scale deployments and user testing of our MMM prototypes: 60 users used MMM1 on the Nokia 3650 cameraphone in 2003-2004, and 60 users used MMM2 on the Nokia 7610 cameraphone in 2004-2005. Our SIMS graduate student users in IS202 Information Organization and Retrieval have also worked in project teams to develop numerous innovative mobile media application concepts based on MMM1 and MMM2.

Mobile Media Metadata 2 (MMM2)

In our MMM research, we leverage regularities in media and metadata created by communities of users that share common spatial, temporal, and social contexts to make inferences about the content, context, and community of media captured on mobile devices (especially cameraphones). MMM2 uses "context-to-community" inferencing to support the sharing of mobile media by inferring the likely recipients for media captured on cameraphones based on the user's and the community's prior sharing history and contextual metadata such as the time of capture, CellID and GPS location, and Bluetooth-sensed human co-presence. We have seen a 2189% increase in the number of photos uploaded per user per day in MMM2 (1.31) compared to MMM1 (0.06), which appears to be due to several factors: better image quality in the Nokia 7610 than in the Nokia 3650 (1 megapixel vs. VGA image resolution, "night mode" for low light, and digital zoom); greater familiarity of the user population with cameraphones (12 prior cameraphone users in 2004 vs. only 1 in 2003); the availability of only one camera application in MMM2, rather than the two in MMM1; automatic background upload of photos to the MMM2 web photo management application; and automatic suggestion of sharing recipients on the cameraphone and in the web application. Our qualitative and quantitative studies have shown that MMM2 users are pleased with the share guesser's ability to suggest sharing recipients based on prior sharing history and contextual metadata, and that they share on average 26% of the photos they capture and upload with MMM2.
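
To make the idea of "context-to-community" inferencing concrete, here is a minimal sketch in Python of how a share guesser might rank recipients. It is an illustration only, not the MMM2 implementation: the `ShareEvent` record, the one-point-per-cue `context_similarity` score, the two-hour time window, and the sample history are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ShareEvent:
    """One past share: the capture context and who received the photo."""
    cell_id: str        # CellID at capture time
    hour: int           # hour of day (0-23) at capture time
    nearby: frozenset   # Bluetooth-sensed co-present people
    recipient: str


def context_similarity(event, cell_id, hour, nearby):
    """Hypothetical score: one point for each contextual cue that matches."""
    score = 0.0
    if event.cell_id == cell_id:
        score += 1.0
    # Circular distance between hours of day, so 23:00 and 01:00 are 2 apart.
    gap = abs(event.hour - hour)
    if min(gap, 24 - gap) <= 2:
        score += 1.0
    if event.nearby & nearby:
        score += 1.0
    return score


def suggest_recipients(history, cell_id, hour, nearby, top_n=3):
    """Rank recipients by how closely their past shares match the new context."""
    scores = Counter()
    for event in history:
        scores[event.recipient] += context_similarity(event, cell_id, hour, nearby)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, score in ranked if score > 0][:top_n]


# Made-up sharing history, for illustration only.
history = [
    ShareEvent("cell-A", 12, frozenset({"bob"}), "bob"),
    ShareEvent("cell-A", 13, frozenset({"bob"}), "carol"),
    ShareEvent("cell-B", 22, frozenset(), "dave"),
    ShareEvent("cell-A", 12, frozenset({"bob"}), "bob"),
]
suggestions = suggest_recipients(history, "cell-A", 12, frozenset({"bob"}))
print(suggestions)
```

In this sketch, recipients whose past shares happened in a similar place, at a similar time of day, and with the same people nearby rise to the top, while recipients with no matching context are not suggested at all.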

Mobile Media Metadata 1 (MMM1)

The devices and usage contexts of personal digital photography are undergoing rapid transformation from the traditional camera-to-desktop-to-network image pipeline to an integrated mobile imaging experience. The ascendancy of mobile media capture devices (especially cameraphones) makes possible a significant new paradigm for digital imaging because, unlike traditional digital cameras, cameraphones integrate multimedia capture, programmable processing, wireless networking, rich user interaction capabilities, personal information management functions, and automatic contextual metadata all in one device that users carry with them almost all the time. Our first Mobile Media Metadata prototype (MMM1) leverages the spatio-temporal context and the social community of media capture to infer media content. In our approach we:

  • Gather all automatically available information at the point of capture (time, spatial location, phone user, etc.)
  • Use metadata similarity algorithms to find similar media that has been annotated before
  • Take advantage of this previously annotated media to make educated guesses about the content of the newly captured media
  • Interact in a simple and intuitive way with the phone user to confirm and augment system-supplied metadata for captured media

Using this approach, MMM1 guessed the correct location of the subject of the photo (out of an average of 36.8 possible locations) 100% of the time within the first four guesses, 96% of the time within the first three guesses, 88% of the time within the first two guesses, and 69% of the time as the first guess.
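
The four steps listed above can be sketched roughly as follows. This is a simplified illustration, not the MMM1 algorithm: the `AnnotatedPhoto` record, the weights in `similarity`, and the sample data are assumptions chosen only to show how previously annotated media can yield a short ranked list of guesses for the user to confirm.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AnnotatedPhoto:
    cell_id: str
    taken_at: datetime
    user: str
    location: str   # label previously confirmed by a user


def similarity(photo, cell_id, taken_at, user):
    """Hypothetical metadata similarity; the weights are arbitrary."""
    score = 0.0
    if photo.cell_id == cell_id:
        score += 2.0                       # same spatial context
    if photo.user == user:
        score += 1.0                       # same photographer
    hours_apart = abs((photo.taken_at - taken_at).total_seconds()) / 3600
    score += 1.0 / (1.0 + hours_apart)     # closer in time scores higher
    return score


def guess_locations(annotated, cell_id, taken_at, user, max_guesses=4):
    """Rank candidate location labels for a newly captured photo."""
    totals = defaultdict(float)
    for photo in annotated:
        totals[photo.location] += similarity(photo, cell_id, taken_at, user)
    ranked = sorted(totals, key=lambda loc: (-totals[loc], loc))
    return ranked[:max_guesses]


# Made-up annotated photos, for illustration only.
annotated = [
    AnnotatedPhoto("cell-17", datetime(2004, 3, 1, 12, 0), "alice", "library"),
    AnnotatedPhoto("cell-17", datetime(2004, 3, 1, 13, 0), "bob", "cafe"),
    AnnotatedPhoto("cell-42", datetime(2004, 3, 2, 9, 0), "alice", "gym"),
]

# Step 1: context gathered automatically at the point of capture.
capture = ("cell-17", datetime(2004, 3, 1, 12, 30), "alice")

# Steps 2 and 3: compare with annotated media and produce ranked guesses.
guesses = guess_locations(annotated, *capture)
print(guesses)

# Step 4: the user confirms a guess, which becomes a new annotation.
annotated.append(AnnotatedPhoto(*capture, location=guesses[0]))
```

Each confirmed guess is added back to the pool of annotated media, so in a design like this the guesses for later photos taken in the same context can draw on more evidence.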

Social Uses of Personal Media

This sister project, led by Prof. Nancy Van House, is investigating a central problem for technology design: predicting users and uses for emerging technologies, i.e., doing user-centered design for users and uses that don't yet exist. This problem is especially acute for mobile media technology and applications, in particular cameraphones, which are undergoing rapid growth and transformation. Designers of mobile media technology and applications in industry and academia need new methods to project and design for future uses and users of mobile media. We use the term "social uses" to describe the higher-level motives that guide the specific actions that users perform. For example, while we may observe that a user performs the action of emailing a photo to family members, this action (i.e., "what" the user does) is not the same as the motive informing the action (i.e., "why" the user does it), in this case to maintain the social relationship. Our social science research has uncovered several significant social uses of personal imaging technology that designers of imaging and mobile media technology need to understand and design for: constructing personal and group memory; creating and maintaining social relationships; self-expression; self-presentation; and functional uses for oneself and others. These social uses and the associated findings from our social science research have significant implications for mobile media technology design and inform our development of design methods aimed at projecting and designing for future uses and users of mobile media technology.

Media Streams Metadata Exchange (MSMDX)

Garage Cinema Research is building on Professor Davis' Media Streams, an iconic visual language and system for media annotation, retrieval, and resequencing according to semantic descriptions of media content using manual, semi-automatic, and automatic techniques. The MSMDX project's goal is to create a platform for collaboratively annotating, retrieving, sharing, and remixing multimedia content on the World Wide Web. This platform will be used to discover whether the power of distributed social networks together with semantic web technology can be exploited to solve the problem of how to generate useful machine-readable descriptions of multimedia content. The usefulness of the descriptions produced will be evaluated by building innovative media services that rely on them.


Active Capture

Active Capture software and interaction design automate the capture of stills and video for, and of, users. By integrating capture, processing, and interaction, Garage Cinema Research's Active Capture approach automates the traditional processes of direction and cinematography. Using real-time media analysis in an interactive control loop, Active Capture software structures the user's interaction with a capture device to record reusable, annotated media assets. Garage Cinema Research is researching and developing a set of consumer capture scenarios that support media personalization and reuse as well as design methods and tools for creating Active Capture applications. The captured media assets are automatically annotated for later access and reuse in a variety of applications from Visual IDs to personalized video communications, marketing, and entertainment.
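
The interactive control loop described above might look roughly like the following sketch. It is not Garage Cinema Research's software: media analysis is simulated with lists of loudness values, and the `run_capture` function, the `Asset` record, the retry wording, and the acceptance threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A captured take together with its automatically generated annotations."""
    frames: list
    annotations: dict = field(default_factory=dict)


def run_capture(attempts, is_acceptable, instruction, label, max_attempts=3):
    """Direct the user, analyze each attempt, and re-prompt until one passes.

    attempts: iterable of takes; each take is a list of simulated measurements.
    Returns (asset or None, list of prompts the system gave the user).
    """
    said = []
    for attempt_no, take in enumerate(attempts, start=1):
        if attempt_no > max_attempts:
            break
        # Direction: tell the user what to do, with a retry cue after a failure.
        said.append(instruction if attempt_no == 1 else f"Let's try again. {instruction}")
        # Analysis: decide whether this take is usable.
        if is_acceptable(take):
            asset = Asset(frames=take, annotations={"action": label, "attempt": attempt_no})
            return asset, said
    return None, said


# Simulated takes: per-frame loudness on a 0-1 scale (made-up numbers).
takes = [[0.2, 0.3], [0.4, 0.9]]
asset, said = run_capture(
    takes,
    is_acceptable=lambda take: max(take) >= 0.8,   # assumed threshold
    instruction="Please shout as loudly as you can.",
    label="shout",
)
print(asset, said)
```

The point of the sketch is the loop structure: the system gives direction, analyzes the result, and either re-prompts or stores the take with annotations that describe what was captured.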


Adaptive Media

Garage Cinema Research is researching and developing software for the mass customization and personalization of media by structuring media assets into Adaptive Media Templates (AMTs). AMTs encode media assets in such a way that they can co-adapt input media assets and compute a unique customized and/or personalized result. Garage Cinema Research has systematically automated several of the main functions of cinematic editing, including: reframing and repositioning of images and video (especially of people); audio-video synchronization; cutting on motion; 1-shot/2-shot/cutaway editing; audio Foley; a variety of parametric special effects; and basic editing operations such as keying, compositing, and sequencing. Garage Cinema Research's automatic editing functions render high-quality personalized and customized media in seconds on consumer-level platforms, work that would take skilled operators hours to produce on expensive hardware. Garage Cinema Research is extending its work in Adaptive Media Templates to the development of media components that understand their contents and the principles of their (re)combination.
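
As a rough illustration of the template idea, the sketch below treats a template as an ordered list of slots, some holding stock footage and some left open for a user's own clip. It is a toy model, not the AMT format: the `Slot` record, the `render_plan` function, the clip names, and the trimming rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    name: str
    duration: float                    # seconds the template wants for this slot
    fixed_clip: Optional[str] = None   # stock footage; None means "fill with user media"


def render_plan(template, user_clips):
    """Return an edit list of (source, start, end) tuples, in playback order.

    user_clips maps a slot name to (clip id, clip length in seconds).
    """
    plan = []
    for slot in template:
        if slot.fixed_clip is not None:
            plan.append((slot.fixed_clip, 0.0, slot.duration))
            continue
        clip_id, length = user_clips[slot.name]
        # Simplified co-adaptation: a long clip is trimmed to the slot,
        # and the slot shrinks when the clip is shorter than requested.
        used = min(length, slot.duration)
        start = (length - used) / 2    # keep the middle of the clip
        plan.append((clip_id, start, start + used))
    return plan


# Made-up template and input clip, for illustration only.
template = [
    Slot("intro", 2.0, "stock_intro"),
    Slot("hero", 3.0),
    Slot("outro", 1.5, "stock_outro"),
]
plan = render_plan(template, {"hero": ("user_take_1", 5.0)})
print(plan)
```

The same template yields a different personalized sequence for every input clip, which is the sense in which a single template can be reused for mass customization.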