Category Archives: Research

Come and work with me: KTP Associate in Big Social Data Analytics

Fancy working with me on a Knowledge Transfer Partnership (KTP) project in collaboration with Coup Media (funded by Innovate UK with support from the Welsh Government)?

A KTP Associate position is available to develop an adaptable social media analytics engine and associated framework for the film and media industry to capture consumer insight, marketing perceptions, sentiments, trends and rankings using big social media datasets. With the explosion of social networking, there is a clear correlation between box office takings and sentiments, opinions and perceptions expressed in the public domain on social media platforms. This project aims to leverage this by developing an extensible and adaptable social media sentiment engine using big social datasets (initially targeting Twitter) to rank movies by opinion, informing industry marketing decisions and providing commercially valuable insight into the public’s emerging movie tastes and selections.

This is an 11-month position, with a pro-rata salary of £21,000. For informal enquiries, please drop me an email: tcrick@cardiffmet.ac.uk; further information and how to apply can be found on jobs.ac.uk and the Cardiff Met website.

Deadline for applications: Friday 19 June.

New paper: “Top Tips to Make Your Research Irreproducible”

It is an unfortunate convention of science that research should pretend to be reproducible; we have noticed (and contributed to) a number of manifestos, guides and top tips on how to make research reproducible, but we have seen very little published on how to make research irreproducible.

Irreproducibility is the default setting for all of science, and irreproducible research is particularly common across the computational sciences (for example, here and here). The study of making your work irreproducible without reviewers complaining is a much neglected area; we feel therefore that by encapsulating our top tips on irreproducibility, we will be filling a much-needed gap in the domain literature. By following our tips, you can ensure that if your work is wrong, nobody will be able to check it; if it is correct, you can make everyone else do disproportionately more work than you to build upon it. Our top tips will also help you salve the conscience of certain reviewers still bound by the fussy conventionality of reproducibility, enabling them to enthusiastically recommend acceptance of your irreproducible work. In either case you are the beneficiary.

  1. Think “Big Picture”. People are interested in the science, not the experimental setup, so don’t describe it.
  2. Be abstract. Pseudo-code is a great way of communicating ideas quickly and clearly while giving readers no chance to understand the subtle implementation details that actually make it work.
  3. Short and sweet. Any limitations of your methods or proofs will be obvious to the careful reader, so there is no need to waste space on making them explicit.
  4. The deficit model. You’re the expert in the domain, only you can define what algorithms and data to run experiments with.
  5. Don’t share. Doing so only makes it easier for other people to scoop your research ideas, understand how your code actually works instead of why you say it does, or worst of all to understand that your code doesn’t work at all.

Read the full version of our high-impact paper on arXiv.

2015 EAPLS Board member elections

EAPLS, the European Association for Programming Languages and Systems, aims to stimulate research in the area of programming languages and systems. Formally inaugurated in 1996, it provides a forum for researchers across the domain, working with related organisations and industry to initiate scientific events and stimulate the exchange of ideas, as well as raising funds, organising conferences and disbursing financial support.

I’m standing in the 2015 EAPLS Board elections (current Board members); I believe there is a significant opportunity to rejuvenate the activities of EAPLS and raise its profile: building networks for early career researchers, sponsoring new events/initiatives, engaging with the major conferences and journals in our field, encouraging improved knowledge transfer activities with industry, as well as raising the profile of the wider research areas in both UK and EU funding streams. We can also be more active in the policy space, by highlighting the educational and economic impact of the wider research areas of programming languages and systems.

You can view my full election statement; all EAPLS members (free to join) are eligible to vote, with the election open until 15 April 2015.

Paper submitted to CAV 2015: “Dear CAV, We Need to Talk About Reproducibility”

Today, Ben Hall (Cambridge), Samin Ishtiaq (Microsoft Research) and I submitted a paper to CAV 2015, the 27th International Conference on Computer Aided Verification, to be held in San Francisco in July. CAV is dedicated to the advancement of the theory and practice of computer-aided formal analysis methods for hardware and software systems; the conference covers the spectrum from theoretical results to concrete applications, with an emphasis on practical verification tools and the algorithms and techniques that are needed for their implementation.

In this paper we build upon our recent work, highlighting a number of key issues relating to reproducibility and how they impact on the CAV (and wider computer science) research community, proposing a new model and workflow to encourage, enable and enforce reproducibility in future instances of CAV. We applaud the CAV Artifact Evaluation process, but we need to do more. You can download our arXiv pre-print; the abstract is as follows:

How many times have you tried to re-implement a past CAV tool paper, and failed?

Reliably reproducing published scientific discoveries has been acknowledged as a barrier to scientific progress for some time but there remains only a small subset of software available to support the specific needs of the research community (i.e. beyond generic tools such as source code repositories). In this paper we propose an infrastructure for enabling reproducibility in our community, by automating the build, unit testing and benchmarking of research software.

 
(also see: GitHub repo)

The many Rs of e-Research

[Image: the many Rs of e-Research]

The many Rs of e-Research… what else could/should we add to this (especially in the context of research objects and supporting reproducible research)?

Reproducibility-as-a-service: can the cloud make it real?

Kenji Takeda, Solutions Architect and Technical Manager with Microsoft Research, has written a blog post on Recomputability 2014, as well as discussing some of the issues (and potential opportunities) for reproducibility in computational science we have outlined in our joint paper (including a quote from me):

This is an exciting area of research and one that could have a profound impact on the way that computational science is performed. By rethinking how we develop, use, benchmark, and share algorithms, software, and models, alongside the development of integrated and automated e-infrastructure to support recomputability and reproducibility, we will be able to improve the efficiency of scientific exploration as well as promoting open and verifiable scientific research.

 
Read Kenji’s full post on the Microsoft Research Connections Blog.

It’s impossible to conduct research without software

No one knows how much software is used in research. Look around any lab and you’ll see software — both standard and bespoke — being used by all disciplines and seniorities of researchers. Software is clearly fundamental to research, but we can’t prove this without evidence. And this lack of evidence is the reason why we ran a survey of researchers at 15 Russell Group universities to find out about their software use and background.

 
The Software Sustainability Institute’s recent survey of researchers at research-intensive UK universities is out. Headline figures:

  • 92% of academics use research software;
  • 69% say that their research would not be practical without it;
  • 56% develop their own software (worryingly, 21% have no training in software development);
  • 70% of male researchers develop their own software, and only 30% of female researchers do.

For the full story, see the SSI blog post; the survey results described there are based on the responses of 417 researchers selected at random from 15 Russell Group universities, with good representation from across the disciplines, genders and career grades. It represents a statistically significant number of responses that can be used to represent, at the very least, the views of people in research-intensive universities in the UK (the data collected from the survey is available for download and is licensed under a Creative Commons Attribution licence).

(you may also like to sign this petition and join the UK Community of Research Software Engineers)

Accepted papers and programme for Recomputability 2014

I am co-chairing Recomputability 2014 next week, an affiliated workshop of the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014). The final workshop programme is now available and it will take place on Thursday 11 December in the Hobart Room at the Hilton London Paddington hotel.

I will also be presenting our paper on sharing and publishing scientific models (arXiv), as well as chairing a panel session on the next steps for recomputability and reproducibility; I look forward to sharing some of the outcomes of this workshop over the next few weeks.

The workshop Twitter hashtag is #recomp14; you can also follow the workshop co-chairs: @DrTomCrick and @npch, as well as the main UCC account: @UCC2014_London.

Computing research

There is nothing to do with computers that merits a PhD.

Max Newman (1897-1984), as quoted in Alan Turing: The Enigma by Andrew Hodges

 

Come and do a funded PhD with me

Fancy doing a PhD with me at Cardiff Metropolitan University? I have a fully-funded studentship (for UK/EU students) starting in January, in collaboration with HP in Bristol:

The Department of Computing & Information Systems, Cardiff Metropolitan University, is pleased to offer a fully funded PhD Studentship in Provably Optimal Code Generation.

This research project (Scaling Superoptimisation for Enterprise Applications) is part of an on-going strategic collaboration between Cardiff Metropolitan University and Hewlett-Packard in Bristol; HP is a leading technology company that operates in more than 170 countries around the world, providing infrastructure and business offerings that span from handheld devices to some of the world’s most powerful supercomputers.

Applicants must have an excellent first degree in Computer Science, Computer Engineering, Mathematics or a related discipline, with interests/experience at the hardware/software interface and/or in mathematical foundations.

This three-year PhD will commence in January 2015. The PhD bursary consists of the standard tuition fee for a Home/EU student (to be £3,760 in 2014/15) and a stipend linked to the minimum amount set annually by Research Councils UK (currently £13,590 p.a.).

Project Context:

Our world is increasingly dependent on the effectiveness and performance of software. Tools and methodologies for creating useful software artefacts have been around for many years, but the scalability of these systems for solving challenging real-world problems is, in many important cases, poor. While there are numerous socio-technical issues associated with developing large software systems, there is a significant opportunity to address the optimisation of software in a strategic, adaptable and platform-independent way.

Superoptimisation is an approach to optimising code by aiming for optimality from the outset, rather than as the aggregation of heuristics that are neither intended nor guaranteed to give provable optimality. Building on previous work by Crick et al., this research project will further develop the theoretical foundations of superoptimisation, as well as developing a scalable toolchain for superoptimising enterprise-level software applications.

 
For informal enquiries, please send me an email: tcrick@cardiffmet.ac.uk (but please apply via FindAPhD or here).

Deadline for applications: Friday 31 October.
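
To make the idea of superoptimisation in the project context above more concrete, here is a minimal sketch in Python. It is not the project's toolchain, nor the approach taken in the earlier work the advert cites; it swaps in plain brute-force enumeration. The five-instruction, 8-bit, single-register machine and all of the names (INSTRUCTIONS, run, superoptimise) are hypothetical and exist only for illustration. Programs are tried in order of increasing length and checked against the target on every possible input, so the first match is a shortest equivalent program within this toy model.

```python
from itertools import product

MASK = 0xFF  # model a hypothetical 8-bit register

# Hypothetical single-register instruction set (not taken from any real ISA).
INSTRUCTIONS = {
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (~x) & MASK,
    "shl": lambda x: (x << 1) & MASK,
}


def run(program, x):
    """Execute a sequence of instruction names on the register value x."""
    for op in program:
        x = INSTRUCTIONS[op](x)
    return x


def superoptimise(target, max_len=4):
    """Return a shortest program equal to `target` on all 8-bit inputs.

    Candidates are enumerated by increasing length, so the first program
    that agrees with the target on every input is optimal in length for
    this toy machine. Returns None if nothing of length <= max_len matches.
    """
    inputs = range(MASK + 1)
    expected = [target(x) & MASK for x in inputs]
    for length in range(max_len + 1):
        for program in product(INSTRUCTIONS, repeat=length):
            if all(run(program, x) == e for x, e in zip(inputs, expected)):
                return list(program)
    return None
```

For example, superoptimise(lambda x: 2 * x + 2) returns ['inc', 'shl'], because no single instruction matches and incrementing before shifting is the first two-instruction program that does. The search space grows exponentially with program length, which is the scalability problem the advert refers to.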

Paper submitted to Recomputability 2014: “Share and Enjoy”: Publishing Useful and Usable Scientific Models

Last month, Ben Hall, Samin Ishtiaq and Kenji Takeda (all Microsoft Research) and I submitted a paper to Recomputability 2014, to be held in conjunction with the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014) in London in December. This workshop is an interdisciplinary forum for academic and industrial researchers, practitioners and developers to discuss challenges, ideas, policy and practical experience in reproducibility, recomputation, reusability and reliability across utility and cloud computing. It aims to provide an opportunity to share and showcase best practice, as well as offering a platform to further develop policy, initiatives and practical techniques for researchers in this domain.

In our paper, we discuss a number of issues in this space, proposing a new open platform for the sharing and reuse of scientific models and benchmarks. You can download our arXiv pre-print; the abstract is as follows:

The reproduction and replication of reported scientific results is a hot topic within the academic community. The retraction of numerous studies from a wide range of disciplines, from climate science to bioscience, has drawn the focus of many commentators, but there exists a wider socio-cultural problem that pervades the scientific community. Sharing data and models often requires extra effort, and this is currently seen as a significant overhead that may not be worth the time investment.

Automated systems, which allow easy reproduction of results, offer the potential to incentivise a culture change and drive the adoption of new techniques to improve the efficiency of scientific exploration. In this paper, we discuss the value of improved access and sharing of the two key types of results arising from work done in the computational sciences: models and algorithms. We propose the development of an integrated cloud-based system underpinning computational science, linking together software and data repositories, toolchains, workflows and outputs, providing a seamless automated infrastructure for the verification and validation of scientific models and in particular, performance benchmarks.

 
(see GitHub repo)

Come and do a (fully-funded) PhD with me

Fancy doing a PhD with me at Cardiff Metropolitan University? I have a fully-funded studentship (for UK/EU students) starting in September, in collaboration with HP in Bristol:

Scaling Superoptimisation for Enterprise Applications

Our world is increasingly dependent on the effectiveness and performance of software. Tools and methodologies for creating useful software artefacts have been around for many years, but the scalability of these systems for solving challenging real-world problems is, in many important cases, poor. While there are numerous socio-technical issues associated with developing large software systems, there is a significant opportunity to address the optimisation of software in a strategic, adaptable and platform-independent way.

Superoptimisation is an approach to optimising code by aiming for optimality from the outset, rather than as the aggregation of heuristics that are neither intended nor guaranteed to give provable optimality. Building on previous work by Crick et al., this research project will further develop the theoretical foundations of superoptimisation, as well as developing a scalable toolchain for superoptimising enterprise-level industrial software applications. This research project is a collaboration between Cardiff Metropolitan University and Hewlett-Packard (HP) in Bristol; HP is a leading technology company that operates in more than 170 countries around the world, providing infrastructure and business offerings that span from handheld devices to some of the world’s most powerful supercomputers.

Applicants must have an excellent first degree in Computer Science, Computer Engineering, Electronics or a related discipline, with interests/experience in compilers, optimisation, logic programming, satisfiability modulo theories and mathematical foundations.

 
For informal enquiries, send me an email: tcrick@cardiffmet.ac.uk (but please apply via FindAPhD or here).

Deadline for applications: Friday 22 August.

Paper submitted to WSSSPE2: “Can I Implement Your Algorithm?”: A Model for Reproducible Research Software

Yesterday, Ben Hall and Samin Ishtiaq (both Microsoft Research Cambridge) and I submitted a paper to WSSSPE2, the 2nd Workshop on Sustainable Software for Science: Practice and Experiences, to be held in conjunction with SC14 in New Orleans in November. As per the aims of the workshop: progress in scientific research is dependent on the quality and accessibility of software at all levels and it is critical to address challenges related to the development, deployment and maintenance of reusable software as well as education around software practices.

As discussed in our paper, we feel this multitude of research software engineering problems is manifest not just in computer science, but also across the computational science and engineering domains (particularly with regard to benchmarking and availability of code). We highlight a number of recommendations to address these issues, as well as proposing a new open platform for scientific software development. You can download our arXiv pre-print; the abstract is as follows:

The reproduction and replication of novel scientific results has become a major issue for a number of disciplines. In computer science and related disciplines such as systems biology, the issues closely revolve around the ability to implement novel algorithms and approaches. Taking an approach from the literature and applying it in a new codebase frequently requires local knowledge missing from the published manuscripts and project websites. Alongside this issue, benchmarking, and the development of fair, and widely available benchmark sets present another barrier. In this paper, we outline several suggestions to address these issues, driven by specific examples from a range of scientific domains. Finally, based on these suggestions, we propose a new open platform for scientific software development which effectively isolates specific dependencies from the individual researcher and their workstation and allows faster, more powerful sharing of the results of scientific software engineering.

 
(see GitHub repo)

Paper in ACM TOCE: “Restart: The Resurgence of Computer Science in UK Schools”

Further to the previous CAS papers, Neil Brown (University of Kent), Sue Sentance (formerly Anglia Ruskin University, now CAS), Simon Humphreys (CAS/BCS) and I have had a paper accepted into ACM Transactions on Computing Education: Restart: The Resurgence of Computer Science in UK Schools, part of a Special Issue on Computing Education in (K-12) Schools.

The paper will soon be available to download for free via the ACM Author-ize service (or you can download our pre-print); the abstract is as follows:

Computer science in UK schools is undergoing a remarkable transformation. While the changes are not consistent across each of the four devolved nations of the UK (England, Scotland, Wales and Northern Ireland), there are developments in each that are moving the subject to become mandatory for all pupils from age 5 onwards. In this article, we detail how computer science declined in the UK, and the developments that led to its revitalisation: a mixture of industry and interest group lobbying, with a particular focus on the value of the subject to all school pupils, not just those who would study it at degree level. This rapid growth in the subject is not without issues, however: there remain significant forthcoming challenges with its delivery, especially surrounding the issue of training sufficient numbers of teachers. We describe a national network of teaching excellence which is being set up to combat this problem, and look at the other challenges that lie ahead.

 
(see Publications)

Paper at HCII 2014: “Changing Faces: Identifying Complex Behavioural Profiles”

In June, my colleague Giles Oatley presented a joint paper entitled: Changing Faces: Identifying Complex Behavioural Profiles at HCII 2014, the 16th International Conference on Human-Computer Interaction in Crete.

If you do not have institutional access to SpringerLink, especially the Lecture Notes in Computer Science series, you can download our pre-print. The abstract is as follows:

There has been significant interest in the identification and profiling of insider threats, attracting high-profile policy focus and strategic research funding from governments and funding bodies. Recent examples attracting worldwide attention include the cases of Chelsea Manning, Edward Snowden and the US authorities. The challenges with profiling an individual across a range of activities is that their data footprint will legitimately vary significantly based on time and/or location. The insider threat problem is thus a specific instance of the more general problem of profiling complex behaviours. In this paper, we discuss our preliminary research models relating to profiling complex behaviours and present a set of experiments related to changing roles as viewed through large scale social network datasets, such as Twitter. We employ psycholinguistic metrics in this work, considering changing roles from the standpoint of a trait-based personality theory. We also present further representations, including an alternative psychological theory (not trait-based), and established techniques for crime modelling, spatio-temporal and graph/network, to investigate within a wider reasoning framework.

 
(see Publications)

Call for Papers: Recomputability 2014

I am co-chairing Recomputability 2014, the first workshop to focus explicitly on recomputability and reproducibility in the context of utility and cloud computing; it is open to all members of the cloud, big data, grid, cluster computing and open science communities. Recomputability 2014 is an affiliated workshop of the 7th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2014), to be held in London in December 2014.

Recomputability 2014 will provide an interdisciplinary forum for academic and industrial researchers, practitioners and developers to discuss challenges, ideas, policy and practical experience in reproducibility, recomputation, reusability and reliability across utility and cloud computing. It will provide an opportunity to share and showcase best practice, as well as to provide a platform to further develop policy, initiatives and practical techniques for researchers in this domain. Participation by early career researchers is strongly encouraged.

Proposed topics of interest include (but are not limited to):

  • infrastructure, tools and environments for recomputability and reproducibility in the cloud;
  • recomputability for virtual machines;
  • virtual machines as self-contained research objects or demonstrators;
  • describing and cataloging cloud setups;
  • the role of community/open access experimental frameworks and repositories for virtual machines and data, their operation and sustainability;
  • validation and verification of experimental results by the community;
  • sharing and publication issues;
  • recommending policy changes for recomputability and reproducibility;
  • improving education and training: best practice, novel uses, case studies;
  • encouraging industry’s role in recomputability and reproducibility.

Please see the full call for papers; the deadline for submissions (online via EasyChair) has been extended from 10 August 2014 to 17 August 2014.

The personal cost of applying for research grants

For many academics, this article is a no-brainer. Research grant proposals take huge amounts of time to put together, with low success rates (e.g. EPSRC). It’s a huge cost:

The pressure to win high-status funding means that researchers go to extraordinary lengths to prepare their proposals, often sacrificing family time and personal relationships. During our research into the stressful process of applying for research grants, one researcher, typical of many, said, “My family hates my profession. Not just my partner and children, but my parents and siblings. The insecurity despite the crushing hours is a soul-destroying combination that is not sustainable.”

 

Critical questions for computer science education research

Over the past two years, we have seen wholesale reform of computing (and more specifically, computer science) education in the UK. In England from September 2014, a new national curriculum subject Computing, with a challenging and aspirational programme of study (“A high-quality computing education equips pupils to use computational thinking and creativity to understand and change the world.”) will replace ICT; in Scotland, we see Computing Science forming part of their Curriculum for Excellence; in Wales, September’s review of the ICT curriculum is shaping the ongoing Curriculum for Wales review; along with burgeoning activity in Northern Ireland.

While there is a large corpus of computing education research, along with national and international policy reports, such as the ACM/CSTA’s Running on Empty (2010), the Royal Society’s Shut down or restart? report (2012) and ACM Europe’s informatics education report (2013), there still remain a number of critical questions in computer science education. The recent announcement of the UK Forum for Computing Education provides an opportunity to support this important research agenda. Further to a group discussion led by members of the CSTA at a recent ACM Education Council meeting, the following list of questions covers a breadth of issues and reflects the deep need for further research-grounded solutions to the issues we face.

  a. What are the indicators of incoming student success in introductory level computer science in colleges and universities?
  b. Does computer science learning in schools contribute to success/improvement in other disciplines, especially mathematics and science?
  c. What is the link between age/educational development and the potential to learn and master computer science concepts?
  d. Are there issues of ergonomics in the introduction of computing devices with young children?
  e. Is there a link between previous mathematics learning and success in computer science at school level?
  f. What are the major factors that lead to students making early choices not to pursue computer science?
  g. What is the role of informal education programs in scaffolding learning in computer science, especially in communities where access to computer science learning in school is limited?
  h. What are the potential benefits and drawbacks of MOOCs in school student learning?
  i. What are the potential benefits and drawbacks of MOOCs for the professional development of computer science teachers?
  j. What models of professional development are most effective for improving teacher mastery of computer science concepts and pedagogy?
  k. What are the impacts of current efforts to market computer science to students?
  l. To what extent do poverty and lack of home access to computer science tools impact computer science performance and/or interest in school?
  m. Do one-to-one devices per child programs have any impact on computer science interest or performance?
  n. What are the major factors in computer science teacher retention?
  o. What is required to increase the availability of teacher preparation programs for computer science teachers?
  p. What is the impact of transitioning the content of teacher preparation courses in “educational technology/AV” to a focus on computational thinking across STEM?
  q. What is the ideal balance between content knowledge learning and pedagogical learning in computer science teacher preparation and alternative certifications?
  r. Do hybrid programs (educators and volunteer partnerships) improve student access to rigorous computer science courses and increase the pool of well-prepared computer science teachers?

Which of these do you think is most important? And what is missing? (the questions are listed in no particular order and have been labelled alphabetically for easy referencing in the comments)

Paper at AI-2013: “‘The First Day of Summer’: Parsing Temporal Expressions with Distributed Semantics”

In December, my PhD student Benjamin Blamey presented a joint paper entitled: ‘The First Day of Summer’: Parsing Temporal Expressions with Distributed Semantics at AI-2013, the 33rd SGAI International Conference on Artificial Intelligence in Cambridge.

If you do not have institutional access to SpringerLink, especially the Research and Development in Intelligent Systems series, you can download our pre-print. The abstract is as follows:

Detecting and understanding temporal expressions are key tasks in natural language processing (NLP), and are important for event detection and information retrieval. In the existing approaches, temporal semantics are typically represented as discrete ranges or specific dates, and the task is restricted to text that conforms to this representation. We propose an alternate paradigm: that of distributed temporal semantics, where a probability density function models relative probabilities of the various interpretations. We extend SUTime, a state-of-the-art NLP system to incorporate our approach, and build definitions of new and existing temporal expressions. A worked example is used to demonstrate our approach: the estimation of the creation time of photos in online social networks (OSNs), with a brief discussion of how the proposed paradigm relates to the point- and interval-based systems of time. An interactive demonstration, along with source code and datasets, are available online.

 
(see Publications)
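
As a rough illustration of what "distributed temporal semantics" in the abstract above could look like in practice, the following Python sketch assigns a probability to every day of a year for the phrase "the first day of summer", rather than returning a single date. It is not the paper's implementation and does not use SUTime; the two interpretations (1 June and 21 June), their weights and their spreads are assumptions chosen purely for illustration, and the function names are hypothetical.

```python
import math
from datetime import date, timedelta


def gaussian(x, mean, sd):
    """Value of the normal probability density function at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def first_day_of_summer_density(year):
    """Discrete density over the days of `year` for 'the first day of summer'.

    Each component is (weight, centre date, spread in days). The values are
    illustrative assumptions, not figures from the paper.
    """
    components = [
        (0.4, date(year, 6, 1), 2.0),   # assumed: start of meteorological summer
        (0.6, date(year, 6, 21), 1.0),  # assumed: around the summer solstice
    ]
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days
    days = [start + timedelta(n) for n in range(n_days)]
    raw = {
        d: sum(w * gaussian((d - centre).days, 0.0, sd) for w, centre, sd in components)
        for d in days
    }
    total = sum(raw.values())
    # Normalise so the probabilities over the whole year sum to one.
    return {d: p / total for d, p in raw.items()}
```

A point-based system would have to commit to one of the two dates; here both interpretations remain visible as separate peaks, and with these assumed weights the most probable single day in 2013 is 21 June.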

Grant applications, early 20th century style

Facsimile of a research proposal submitted by Otto Warburg to the Notgemeinschaft der Deutschen Wissenschaft (Emergency Association of German Science), c.1921.

The application, which consisted of a single sentence, “I require 10,000 marks”, was funded in full.

(read the full Nature Reviews Cancer article)
