Dedoose is an easy-to-use, collaborative, web-based application that facilitates all types of research data management and analysis.
Here's what you need to know about how to use it.
Access the Analyze Workspace by clicking the ‘Analyze’ button on the Dedoose main menu bar:
The Analyze Workspace offers a number of chart ‘sets’ based on the various aspects of a project database. The sets are organized in a folder system so it is easy to find the charts, tables, or plots you are looking for depending on the immediate needs of your work. Here’s an example of the charts in the ‘Mixed Methods’ set.
Dedoose Analyze Workspace Mixed Methods Chart Menu
It is also worth noting that a number of the charts, tables, and plots will appear in multiple sets within the ‘Chart Selector’ depending on the aspects of the database being represented in the visual. For example, the ‘Excerpts Per Document’ chart will appear in both the ‘Document Charts’ and ‘Excerpts Charts’ folders.
The Dedoose Analyze Workspace offers a wide variety of data visualizations. These visualizations can be used to examine the general nature of your data, understand how the code system has been applied to the qualitative content, and expose patterns of variation in the qualitative data and coding activity across sub-groups. The visuals can be exported to facilitate the presentation of research findings, and they can serve as filters or windows to drill deeper into findings. These charts, tables, and plots are designed to be as informative, intuitive, and transparent as possible. They can be used in numerous combinations and flexibly adapted to address particular research questions. Finally, most visuals have a variety of options that can be found in the upper right corner of the display, so look for icons to swap display format, expand view, export, include sub-code count …
Fundamental to all aspects of the Dedoose design is transparency. Our goal is to ensure that all users can quickly learn to navigate smoothly throughout Dedoose and easily carry out data input, manipulation, and analysis tasks. Notably, in the Analysis Center and throughout Dedoose, every bar, slice of pie, bubble, and cell in a table is ‘HOT,’ i.e., dynamically linked to the underlying qualitative data. One click on that part of the visual will pull up the associated qualitative content represented by the bar/slice/bubble/cell. From there you can explore further to more deeply understand the nature of the qualitative content or export it for use in preparing a manuscript or presentation.
Chart Expansion and Export
Throughout Dedoose, there are two common controls in the panel header for exporting and viewing the chart in full screen:
Introduction to Dedoose Charts, Tables, and Plots
Again, while a number of the data visualizations will appear in multiple sets within the ‘Chart Selector,’ we introduce the key types here before moving on to a more detailed description of what can be found within each of the charts/tables/plots.
The charts in this screenshot represent the relative number of excerpts that have been created within each media file. As a reminder, each bar in this display is ‘hot’ and a simple click will immediately pull up and present the underlying excerpts to facilitate interpretation of the graphical image.
Example: Excerpt Count x Media Chart
This next screenshot is an example of one of the many Dedoose frequency tables. This table presents the frequency with which each code (columns) has been applied to an excerpt in each media file (rows).
Example: Code Application Chart
Tables like these are useful in visualizing how a code system has been applied across media files/data. While it may seem easy to recall the pattern in coded content by case, remembering interview #1 after working with interviews #2 to #10 is very difficult. Moreover, team members may not be familiar with the work done by others. Emerging patterns in how a code system has been applied can play an important role in data analysis and interpretation. It is under these circumstances that presentations of data as in this Code by Document table can be illuminating. Remember, clicking a cell in the table will pull up the underlying content for further consideration, and the totals row and column allow for pulling up batches of excerpts by media or code. Also, note that frequency is mapped to the color spectrum (reds being more frequent and blues less), so your eye will quickly move to where the action is taking place. Patterns here can help you see how the conceptual framework you have represented in your code/tag system has been applied to (or mapped onto) your source data.
Descriptor ratio charts present the relative numbers of each sub-group for each of a project’s list-type (categorical) descriptor fields. These visualizations facilitate an understanding of variation within a project sample and can serve as filters or windows on the data provided by a particular subgroup for segmentation or subgroup specific analysis.
Example: Descriptor Ratios Chart
Code Application by Descriptor Charts
The Code Application by Descriptor field charts (including the dynamic descriptors, which are designed to support analysis of change over time in your qualitative data) have a number of options and can be some of the most useful visualizations for analysis, interpretation, and communication/presentation of research findings. Essentially, these charts represent the number of excerpts that have been associated with a particular code separately for each sub-group within a descriptor field. In the example below we see charts for each code showing the relative number of excerpts tagged with the specific code separately for sub-groups based on the ‘Mother Primary Language’ descriptor field groups: ‘Bilingual,’ ‘English,’ and ‘Spanish.’
The drop-down menu allows for an immediate reconfiguration of the charts by selecting any of the list-type or grouped number or date/time descriptor fields in the project.
Example: Codes x Descriptor Chart
Other important controls for these charts can be found in the panel sub-header, which shows radio buttons next to the drop-down menu to switch the chart from relative excerpt count to average weights applied (where used), together with check boxes for ‘Hit/Miss,’ ‘Sub-code Count,’ ‘Normalize,’ and ‘%.’ By default, the ‘Normalize’ and ‘%’ boxes are checked.
Example: Codes x Descriptor Chart with Normalization Off
This ‘non-normalized’ chart, as compared to the original, appears to suggest a markedly high frequency of ‘Letter Recognition’ coded excerpts for the Spanish group. Hence, normalized charts provide a less biased perspective of the underlying data.
Example: Codes x Descriptor Chart Raw Counts
Descriptor by Descriptor by Code Application/Weight Charts
The Descriptor by Descriptor by Code Application/Weight chart has many options for drilling even deeper into variations across population subgroups. These ‘nested’ or ‘cross-tab’ display charts allow for examination of the qualitative data and coding/weighting activity based on two descriptor fields. The example below shows the average weight assigned to excerpts coded with ‘Reading by Mother’ across ‘Mother Primary Language’ within ‘Mother Work Status’ subgroups. These charts can expose variation in value, sentiment, importance, quality, etc. (anything you have used the weighting system to represent) across various levels in the overall population. In the example, we see an interesting interaction pattern with variations in the levels of primary caretaker reading quality as a function of mother language and work status (i.e., excerpts were rated generally higher for non-working moms who are bilingual or English speaking, but for full- or part-time working moms who speak Spanish).
Example: Descriptor x Descriptor x Code Weight Chart
Code Weight Frequency by Descriptor Field
The Code Weight Frequency by Descriptor Field Chart allows for a focus on the code weight distribution across descriptor field categories. The following example is based on the ‘Reading by Mother’ code and the ‘General Home Language’ descriptor field. The resulting chart shows clusters of excerpts tagged with ‘Reading by Mother’ across the weight scale and broken out by the home language descriptor field sub-groups. These displays can help communicate complex patterns and provide easy access to subsets of excerpts meeting specific sub-group and weight level criteria.
Example: Code Weight Frequency x Descriptor Field Chart
The Code Weight Frequency by Descriptor Field chart is a prime example of how mixed methods can expose important patterns in research data. Well-designed code weight systems are ‘grounded’ in the underlying qualitative data; that is, they are based on investigator-defined variation within the sample population. When such systems are clearly defined and reliably applied, they:
Code Weight by Descriptor Bubble Plots
The Code Weight by Descriptor Bubble Plots allow a four-dimensional exploration and presentation of project data. These plots are based on the average weights associated with the selected codes across the selected descriptor field sub-groups.
Example: Code Average Weight x Descriptor Field Bubble Plot
In the above example, from a study on the hotel characteristics reported as desirable across age and income level, the bubbles represent different income groups. The size of the bubbles represents the average code weight associated with application of the ‘Intimacy’ code, which has been applied when intimacy was mentioned as a desirable characteristic, with a weighting assigned to indicate the reported level of importance. The X and Y axes show the average weights associated with the application of ‘Sophistication’ and ‘Cost’ codes respectively. The highlighted bubble indicates that respondents reporting more than $250,000 in annual income report relatively low importance for ‘Cost,’ moderate importance for ‘Sophistication,’ and high importance of ‘Intimacy’ as compared to the other income level groups when making hotel decisions.
Wherever code weight/rating systems are used alongside code application activity (e.g. to index variation in importance, strength, sentiment, value, quality…), these plots can quickly expose complex multi-dimensional relations between variables across sub-groups. Finally, these plots are easily modified through the drop-down menus, communicate tremendous amounts of information and can be used to access the excerpts represented by a particular bubble or as filters for further analysis.
Code Frequency by Descriptor Bubble Plots
The Code Frequency by Descriptor Bubble Plots allow for a four-dimensional exploration and presentation of project data based on the frequency with which particular codes were applied to excerpts across the selected descriptor field subgroups.
Example: Code Frequency by Descriptor Bubble Plot
In the above example, from a study on the hotel characteristics reported as desirable across age and income level, the bubbles represent different annual income groups. The size of the bubbles represents the frequency with which the ‘Cost’ code was applied to excerpts within each sub-group. The X and Y axes represent the frequency with which the ‘Luxury’ and ‘Warmth’ codes were applied respectively. The highlighted bubble indicates that in comparison to other income groups, respondents reporting annual income of greater than $250K discuss issues of Luxury and Cost in hotel evaluations relatively more frequently and issues of Warmth relatively less frequently.
As with the code weight by descriptor bubble plots, these visuals can quickly expose complex multi-dimensional relations between variables across sub-groups and provide quick access to the underlying qualitative content.
You can also view our blog post on this type of chart.
Code Co-Occurrence Matrix
The Code Co-Occurrence Table provides information about how the code/tag system was used across all project excerpts. This symmetric, code-by-code matrix presents the frequencies with which all code pairings were applied to the same excerpt and, by default, to overlapping excerpts. Such a display can expose both expected and unexpected patterns in which two codes were (or were not) used together. These patterns illuminate how concepts related to the research questions and represented by the code system are combined in the natural schema (i.e., cognitive frameworks that help organize and interpret information) activated by study participants as they report on the topic represented by project codes.
What does it tell us about our data, research questions, and how respondents naturally connect the concepts we are examining when we discover that codes ‘A’ and ‘B’ co-occur at relatively low rates compared to codes ‘A’ and ‘C’? Dedoose facilitates the process of addressing questions like these quickly and with a variety of options to suit different researcher preferences. It is also important to note that when counts for overlapping excerpts are included, the cell values represent ‘hits’ for excerpts coded with both of the associated codes AND for excerpts coded with one of the codes that overlap with an excerpt coded with the other code. This feature can be deactivated by clearing the ‘Include Overlapping Excerpts’ box in the upper right corner of the panel.
Example: Code Co-Occurrence Matrix
For example, the highlighted cell in the table above indicates that 16 excerpts or overlapping excerpts were coded with both the ‘Reading by Others’ and ‘Letter Recognition’ codes. This pairing’s relatively high frequency indicates that as participants are thinking and reporting on one of the concepts, they often discuss thoughts about the other. Such a combination suggests that an overarching schema which includes both concepts is being activated as participants formulate their responses. Drilling down to the underlying qualitative data (by clicking the cell and reviewing the excerpts) provides a deeper understanding of participant reports and the naturally occurring patterns in their thought processes.
Observation of patterns in how the code system was applied can illuminate a wide variety of connections within (a) the nature of the conceptual framework represented by your code/tag system and how it was applied and (b) the nature of the data themselves. Patterns like these are often unlikely to be noticed or understood in the midst of coding activity. Yet in the analysis stage, these patterns can be extremely valuable in discovering and understanding how respondents naturally discuss these concepts in combination and how researchers will discuss these organizing principles as they understand and present their findings.
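The same-excerpt counting behind a co-occurrence matrix can be sketched in a few lines of Python. This is only an illustration of the idea, not Dedoose's implementation: the excerpt data and the function name `co_occurrence` are hypothetical, and the sketch counts only codes applied to the same excerpt (it does not handle overlapping excerpts).

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each entry is the set of codes applied to one excerpt.
excerpts = [
    {"Reading by Others", "Letter Recognition"},
    {"Reading by Others", "Letter Recognition", "Reading by Mother"},
    {"Reading by Mother"},
    {"Letter Recognition", "Reading by Mother"},
]


def co_occurrence(excerpts):
    """Count how often each unordered pair of codes shares an excerpt."""
    counts = Counter()
    for codes in excerpts:
        # Sorting makes (A, B) and (B, A) land in the same cell.
        for pair in combinations(sorted(codes), 2):
            counts[pair] += 1
    return counts


matrix = co_occurrence(excerpts)
for (code_a, code_b), n in sorted(matrix.items()):
    print(f"{code_a} x {code_b}: {n}")
```

Because each pair is stored once in sorted order, the result corresponds to one triangle of the symmetric matrix; the other triangle mirrors it.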
Code Weight Statistics Tables
The Code Weight Statistics Tables offer another means of examining and understanding coding activity in the project. These tables display basic counts and, where appropriate, statistics of how the weights for each code were distributed across code applications. The application count cells are also great shortcuts for activating a code-specific filter or pulling up all excerpts associated with the code for further exploration.
Example: Code Weight Statistics Chart
Descriptor Field by Descriptor Field
The Descriptor Field by Descriptor Field chart is essentially a cross-tab analysis of the relative frequency of members in each sub-group plotted for two descriptor fields—one nested within the other. For example, in the screenshot below, you see ‘Father Work Status’ graphed against ‘PC Reading Change Group.’ The result of the analysis is Chi-square = 3.79 with 4 degrees of freedom—not a statistically significant relationship. This non-parametric statistical analysis is commonly reported in the description of a research project participant population and in discovering how the population may have interacting characteristics that should be identified and considered in the interpretation of study results.
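For readers who want to check a reported chi-square value by hand, here is a minimal Python sketch. It assumes the standard Pearson chi-square test on a contingency table, with degrees of freedom equal to (rows - 1) x (columns - 1); the function names are hypothetical, and the p-value helper uses a closed form that holds only for even degrees of freedom (such as the 4 reported above).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p_value(stat, df):
    """Upper-tail p-value; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch supports positive even df only")
    half = stat / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


# The values reported in the text: chi-square = 3.79 with 4 degrees of freedom.
p = chi_square_p_value(3.79, 4)
print(f"p = {p:.3f}")
```

The resulting p-value is well above the conventional 0.05 threshold, which is consistent with the text's reading of the result as not statistically significant.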
Example: Descriptor Field by Descriptor Field Chart
Code Weight Distribution Plot
The Dedoose code weight/rating distribution plot offers both a visual representation of how excerpts have been distributed across the defined dimension and access to subsets of excerpts via clicking on a particular segment. Continuous numbers like these are typically described by the frequency of data points along the distribution and by characteristics representing central tendency and dispersion including:
Examining this information can help you understand (and report on) how you’ve used the weight/rating system across the coded excerpts. You can control the number of segments using the slider provided in the Dedoose pop-up and then explore the excerpts within each cluster by simply clicking on a segment to pull up the associated excerpts. In the example below from a sample Dedoose code, we see a plot that ranges from 0 to 150 million with 5 segments grouping subsets of values along the range and measures of central tendency and dispersion provided along the bottom of the pop-up.
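To make the idea of segments and summary measures concrete, here is a small Python sketch. It assumes equal-width segments between the smallest and largest weight, which may differ from how Dedoose draws its segments; the weights and the name `segment_counts` are hypothetical.

```python
import statistics

# Hypothetical code weights applied to ten excerpts.
weights = [1, 2, 2, 3, 5, 7, 8, 9, 10, 10]


def segment_counts(values, segments):
    """Count values falling in each of `segments` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / segments
    if width == 0:
        # All values identical: everything lands in the first segment.
        return [len(values)] + [0] * (segments - 1)
    counts = [0] * segments
    for v in values:
        # The maximum value is clamped into the last segment.
        idx = min(int((v - lo) / width), segments - 1)
        counts[idx] += 1
    return counts


counts = segment_counts(weights, 3)
mean = statistics.mean(weights)
median = statistics.median(weights)
spread = statistics.stdev(weights)  # sample standard deviation

print("segment counts:", counts)
print("mean:", mean, "median:", median, "stdev:", round(spread, 2))
```

Changing the `segments` argument plays the same role as moving the slider: the values are regrouped into more or fewer clusters while the summary measures stay the same.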
There are two ways to activate the Code Weight Distribution Panel in Dedoose. First is via the Codes panel.
Note: The ‘Distribution Plot’ button will only appear for codes with the weight/rating system activated. You can also identify codes with weights/ratings activated by the dark outline surrounding their colored marker in the Codes panel.
The second way to activate the pop-up is via the Analyze Workspace:
Descriptor Field Correlation
Descriptor Number Distribution Plot
Descriptor Field T-Test
The Descriptor Field T-Test, known outside of Dedoose as the independent samples t-test, tests the null hypothesis that there is no difference between the means of two independent samples. The null hypothesis is what we retain when the data provide no statistically supportable evidence of a difference. When we find a statistically significant difference, we ‘reject the null hypothesis’ and conclude that there is significant evidence that the two groups differ.
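As a rough illustration of what an independent samples t-test computes, the sketch below derives the t statistic and degrees of freedom for two small groups. It assumes the classic pooled-variance (equal variances) form; the source does not say which variant Dedoose uses, and the sample values and function name are hypothetical. Converting t to a p-value requires the t distribution and is left out here.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical values of a number-type descriptor for two sub-groups.
t_stat, df = independent_t([2, 4, 6], [1, 2, 3])
print(f"t = {t_stat:.3f}, df = {df}")
```

The larger the absolute value of t for a given number of degrees of freedom, the stronger the evidence against the null hypothesis of equal means.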
The normalization function, an option in a variety of Dedoose frequency charts, operates by assigning a weight of '1' to the class with the largest number of members (the 'basis' class) and then assigning weights to the other classes as a function of the ratio between the number of members in each class and the number of members in the 'basis' class. These weights are then used to adjust the raw counts to achieve ratio equivalence across classes for visualization, and the weighted percentage is calculated based on these adjusted counts.
For example, if we are interested in the relative percentage of times people of different color eyes blink in one minute, we calculate as follows:
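A calculation of this kind can be sketched in Python as follows. The group sizes and blink counts are invented for illustration, and the sketch assumes that each class weight is the basis class size divided by that class's size, which is one plausible reading of the description above rather than a confirmed Dedoose formula.

```python
def normalize(raw_counts, class_sizes):
    """Return (weights, adjusted counts, weighted percentages) per class."""
    basis = max(class_sizes.values())  # size of the largest ('basis') class
    # Assumed weight: basis size / class size, so the basis class gets 1.
    weights = {g: basis / n for g, n in class_sizes.items()}
    adjusted = {g: raw_counts[g] * weights[g] for g in raw_counts}
    total = sum(adjusted.values())
    percents = {g: 100 * a / total for g, a in adjusted.items()}
    return weights, adjusted, percents


# Hypothetical sample: number of people per eye color and total blinks observed.
class_sizes = {"blue": 10, "brown": 20, "green": 5}
blinks = {"blue": 150, "brown": 400, "green": 50}

weights, adjusted, percents = normalize(blinks, class_sizes)
for group in blinks:
    print(group, weights[group], adjusted[group], round(percents[group], 1))
```

With these made-up numbers, the brown-eyed group accounts for about two thirds of the raw blinks simply because it has the most members; after adjustment its share drops to under half, which is the kind of correction normalization is meant to provide.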
Dedoose offers a variety of approaches to filtering your data. Most users’ analytical needs, much or all of the time, will be fully served by simply using the auto-generated and interactive data visualizations available in the Dedoose Analyze Workspace. The visuals make it easy to discover and explore patterns in the data and then drill in to better understand the rich qualitative stories that live beneath the surface. However, for those wishing to focus their analysis on particular subsets of data without distraction from other aspects of the database, Dedoose offers powerful database-wide filtering capabilities.
Or vs And filtering:
Filtering in Dedoose can include two basic operators: ‘or’ and ‘and.’ An ‘or’ relationship filter is used when you would like to cast a wider net that might include a few target subsets. As a simple example, consider the data that are active by default: everything, i.e., the full data set view. If you want to narrow to excerpts tagged with one or more of some subset of codes you can do the following:
An ‘and’ relationship usually defines narrower data sets. For example, say you’d like to see what Spanish-speaking AND female participants contributed to your data set. In this case, the ‘AND’ relationship is between the female AND Spanish-speaking descriptor options. Here’s one way to do this:
The excerpts that remain active will be only those which were provided by females who reported speaking Spanish.
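The logic of the two operators can be illustrated with a short Python sketch. The excerpt records, field names, and code names below are hypothetical stand-ins for project data; the sketch shows only the Boolean logic, not how Dedoose stores or filters excerpts.

```python
# Hypothetical excerpts with their applied codes and participant descriptors.
excerpts = [
    {"id": 1, "codes": {"Reading by Mother"}, "gender": "Female", "language": "Spanish"},
    {"id": 2, "codes": {"Letter Recognition"}, "gender": "Male", "language": "Spanish"},
    {"id": 3, "codes": {"Reading by Others"}, "gender": "Female", "language": "English"},
    {"id": 4, "codes": set(), "gender": "Female", "language": "Spanish"},
]

# 'Or' filter: keep excerpts tagged with at least one of the selected codes.
wanted_codes = {"Reading by Mother", "Letter Recognition"}
or_ids = [e["id"] for e in excerpts if e["codes"] & wanted_codes]

# 'And' filter: keep excerpts from participants who are female AND Spanish speaking.
and_ids = [
    e["id"]
    for e in excerpts
    if e["gender"] == "Female" and e["language"] == "Spanish"
]

print("or filter:", or_ids)
print("and filter:", and_ids)
```

The ‘or’ filter widens the net (any one matching code is enough), while the ‘and’ filter narrows it (every condition must hold).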
Filtering via Chart Selection Reviewer
There are two basic approaches to activating filters in Dedoose. The simplest approach is to filter via the chart selection reviewer activated following a click on some aspect of an analytic visualization. For example, the following series of snapshots illustrates the process.
Clicking ‘Make Active Set’ in the Chart Selection Reviewer activates a filter so all subsequent analysis will only be focused on excerpts from families reporting having male children. The following snapshot shows the impact of this filter:
a. The only active data in the codes by descriptor bar chart
b. The ‘Data Set’ icon in the main menu bar is shaded red as an indicator a filter is active
c. A 'funnel' icon appears in the chart header which serves as a toggle to view all data or just filtered data when a filter is active
The same visual after clicking funnel icon to toggle filter to ‘off’ position
When any filter is active, all subsequent analysis is focused only on the data meeting the filter criteria. Thus, with this approach the three buttons referencing ‘Active Set’ at the bottom of the Chart Selection Reviewer pop-up allow for filtering the entire database based on the sub-set of data represented in the list. Beyond simply activating a filter as in the example, one can also add or remove other subsets to/from the active filter by using the ‘Add to/Remove from Active Set’ buttons.
Excerpt Workspace Filtering:
The Excerpts Workspace allows for more complex filtering options than are available anywhere else in Dedoose, as you can filter down to specific excerpts in addition to the categories into which they fall. It is worth noting the difference between filtering in the Excerpts Workspace and the Data Set Workspace. In the Excerpts Workspace, filtering can be done with one or a series of ‘and’ operations in an iterative way, and the results of each round can then be combined with ‘or’ operations in the overall filter being created. This means that: a. once each round of filtering is established, the only data displayed will meet all of that round’s filter criteria; and b. adding further rounds of ‘and’ operations allows for the inclusion of ‘or’ operations, for example, ‘males and high income’ OR ‘males and moderate income.’ Any given piece of data you wish to include in the overall filter therefore only needs to meet the criteria of one of the rounds of filtering. The Data Set tab, by contrast, can only filter with a single ‘and/or’ operation.
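One way to picture this kind of iterative filter is as a list of rounds, where each round is a set of ‘and’ criteria and an excerpt is kept if it satisfies any one round. The Python sketch below illustrates that logic with hypothetical records and function names; it is not a description of how Dedoose implements it.

```python
def matches_round(excerpt, criteria):
    """True when the excerpt meets every criterion in one 'and' round."""
    return all(excerpt.get(field) == value for field, value in criteria.items())


def matches_filter(excerpt, rounds):
    """True when the excerpt meets at least one round ('or' across rounds)."""
    return any(matches_round(excerpt, criteria) for criteria in rounds)


# 'males and high income' OR 'males and moderate income'
rounds = [
    {"gender": "Male", "income": "High"},
    {"gender": "Male", "income": "Moderate"},
]

# Hypothetical excerpts carrying their participants' descriptor values.
excerpts = [
    {"id": 1, "gender": "Male", "income": "High"},
    {"id": 2, "gender": "Male", "income": "Low"},
    {"id": 3, "gender": "Female", "income": "High"},
    {"id": 4, "gender": "Male", "income": "Moderate"},
]

active_ids = [e["id"] for e in excerpts if matches_filter(e, rounds)]
print(active_ids)
```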
To add additional filters:
Filtering from the Analyze Workspace:
In addition to providing various counts and visual displays of distributions for your data set, the Analyze workspace creates sets of excerpts that can be very difficult to define manually. For example, it can isolate excerpts that overlap with the excerpts of a specific code. Luckily, you can create data sets from these as well!
Code Tree Filtering:
Filtering codes via the code list allows you to select individual codes and remove them, add them, or set them as active data sets in Dedoose. Normally, filtering via the Codes section of the Data Set tab is sufficient, as it removes the other codes from view; however, if you are interested in how the auxiliary codes overlap with a set of codes you select, this is the way to go.
Make Active Set - This clears any current data set and sets the code's excerpts as the current data set.
Add To Active Set - This adds the excerpts to the current data set.
Remove From Active Set - This removes the excerpts from the current data set.
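These three actions behave like basic set operations on excerpt identifiers. The following Python sketch mirrors them with hypothetical function names and excerpt IDs, purely to illustrate the logic.

```python
def make_active_set(active, selection):
    """Replace whatever is active with the selected excerpts."""
    return set(selection)


def add_to_active_set(active, selection):
    """Union: keep the current excerpts and add the selected ones."""
    return active | set(selection)


def remove_from_active_set(active, selection):
    """Difference: drop the selected excerpts from the current set."""
    return active - set(selection)


active = {1, 2, 3}                             # hypothetical excerpt IDs
active = make_active_set(active, {4, 5})       # only the selection remains
after_make = set(active)
active = add_to_active_set(active, {5, 6})     # excerpt 6 is added
after_add = set(active)
active = remove_from_active_set(active, {4})   # excerpt 4 is dropped
print(sorted(active))
```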
Filtering via the Data Set Workspace
Though filtering via the chart, table, and plot shortcuts will serve many needs, the Dedoose Data Set Workspace serves those requiring a more complex database query and those wishing to save filters for later use. The snapshot below shows the ‘Data Selector’ pop-up activated by clicking the Data Set Workspace icon.
In the Data Set Workspace, a filter can be defined based on any object in the database, including descriptors, media, users, and codes. Two important things to keep in mind when creating filters are:
Defining a Filter
So, what constitutes a complex query? Again, filtering in Dedoose can include criteria from descriptors, media, users, and/or codes and the sole focus is on the activation or deactivation of excerpts. Many users are familiar with the use of Boolean operators in defining a database query. Here we will work through an example in our sample data to illustrate how to define and activate a more complex query in Dedoose. Let’s imagine we wish to narrow our active set to only those excerpts meeting the following criteria:
Setting up this somewhat complex query is relatively transparent when working in the Dedoose Select Data Workspace. While there are different approaches to setting up the same query, the most straightforward is to make basic use of the SuperMegaGrid features to isolate and activate the data of interest.
Note: For clarity, when working in the SuperMegaGrid we recommend clearing all columns except those of interest prior to filtering (you will see folders for each appear in the ‘Filtering’ panel) AND using ‘Deactivate All’ for a more straightforward approach to setting criteria for those excerpts you intend to activate.
Here are the steps:
The following screenshots show a basic code by descriptor chart with the filtering active:
…and toggled to the full data set:
Saving a Database Filter
Regardless of the complexity of activated filters, users and research teams may want to save and re-load database filters (queries) that have been defined for particular purposes.
Saving an active filter that may be useful in the future, for review or as new data are added to the database, also takes place in the Data Selector workspace. To save a filter (WHEN THE FILTER IS ACTIVE):
Re-Loading a Saved Filter
Re-loading a saved filter (or database query) is a simple procedure in Dedoose and is achieved via the Data Selector workspace. To re-load a filter:
Filters are deactivated by either exiting the Dedoose App or:
Searching across documents and excerpts:
Searching across documents and excerpts allows you to pull up a list of items that contain a specific text string, so you can filter for items with a term or phrase of interest.
For document content searching:
For document title searching:
For excerpt searching: