What I learned about data analysis from doing psychotherapy

As I note on my “about page,” I am in a clinical psychology doctoral program. So while I have an interest in applied statistics for social science research, I also have a passion for psychotherapy and the treatment of psychological disorders. I just finished my 4th year doing psychotherapy with adult clients. It was my first year being trained by psychotherapists in the community rather than by academic psychologists who mostly do basic research on psychological disorders. With this training, I learned more in the past year than in the previous three combined. I owe it to great supervision and challenging clients, both of which taught me how to be a better psychotherapist.

As I reflect on the past year, I realize that many of the things I learned about psychotherapy can be applied to data analysis. I thought I would summarize six of them below.

1.     Research versus practice

Psychotherapy: Many disciplines have written about the discrepancy between research and practice, and it often takes a similar form whatever the discipline. In clinical psychology, researchers develop and evaluate the efficacy of new and existing psychotherapies. Because researchers use the scientific method, they need precise measurement and control of extraneous variables. For example, researchers will obtain a sample of people with primarily (or even only) a particular categorical diagnosis, such as major depressive disorder. Then the researchers will specify the frequency, duration, and content of each psychotherapy session. These are often described as ideal psychotherapy conditions, and the researchers are determining how well the psychotherapy works under them. However, psychotherapists don’t see idealized clients. We see heterogeneous individuals who might have insomnia first and depression second, or might not want to follow the protocol researchers developed, or might be limited by financial resources. The researcher controls for these extraneous variables through random assignment across individuals. However, the psychotherapist working with one particular client cannot ignore them. The psychotherapist seeking to do best practice needs to take them into account and adapt their psychotherapy accordingly. This requires greater flexibility than the researcher’s new psychotherapy manual suggests.

Data Analysis: Data analysis is practice, just like psychotherapy is. The researchers are the statisticians who create new statistics and evaluate the validity of new and existing statistics. Similarly, statisticians study data in ideal conditions. They often make up a hypothetical population that meets standard statistical assumptions (e.g., multivariate normality) and then randomly sample cases from that hypothetical population (i.e., Monte Carlo simulation). Real data do not conform to these statistical assumptions. That means when a data analyst is deciding what constitutes best practice for a dataset, the idiosyncrasies in the real data must be attended to. My datasets do not look like the fake data statisticians make up, just like my psychotherapy clients do not look like Sally in Judy Beck’s Cognitive Behavioral Therapy: Basics and Beyond.
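To make the contrast concrete, here is a minimal sketch (in R, with made-up parameter values) of the idealized world a Monte Carlo simulation studies: a hypothetical population that perfectly meets the assumptions, sampled over and over.

```r
# A minimal Monte Carlo sketch: an idealized population that meets
# standard assumptions (bivariate normality), sampled repeatedly.
# All parameter values are made up for illustration.
library(MASS)

set.seed(123)
n_reps  <- 1000   # number of simulated samples
n_cases <- 100    # cases per sample
Sigma   <- matrix(c(1, 0.3, 0.3, 1), nrow = 2)  # true correlation = .3

estimates <- replicate(n_reps, {
  d <- mvrnorm(n_cases, mu = c(0, 0), Sigma = Sigma)
  cor(d[, 1], d[, 2])  # estimate the correlation in this sample
})

# How well does the estimator behave in these ideal conditions?
mean(estimates)  # should be close to the true .3
sd(estimates)    # empirical sampling variability
```

Real datasets, of course, never come from a population this tidy.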

Example: I was creating individual growth curves for a dataset. I wanted to save the individual growth curves and then use them as a predictor in a regression model. The statistical research indicates that growth curves estimated via Bayes estimates from a mixed effects model are best practice compared with ordinary least squares growth curves. The Bayes estimates “shrink” the growth curves to counteract the added variance in ordinary least squares due to measurement error. So, following the research, I calculated the growth curves as Bayes estimates. However, the growth curves were coming out really weird. What I realized is that the distribution of individual growth curves was not normal and was in fact very positively skewed. Essentially, most people didn’t change at all, some people increased a little bit, and only a few people really increased (no one really decreased). The Bayes estimates from a multi-level model assume the growth curves are normally distributed. Given that assumption, Bayes estimates are better than ordinary least squares growth curves. However, that assumption was not true for my specific dataset. So I instead used ordinary least squares growth curves, which allowed the distribution of individual growth curves to be very positively skewed.
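For readers who want to see the contrast, here is a minimal sketch of the two approaches in R, assuming a long-format data frame `df` with columns `id`, `time`, and `y` (names hypothetical):

```r
# Two ways to get individual growth curves (slopes), assuming a
# long-format data frame `df` with columns id, time, and y.
library(lme4)

# 1) Bayes (shrunken) estimates from a mixed effects model.
#    Person-specific slopes are pulled toward the overall mean slope,
#    which assumes they are normally distributed around it.
fit <- lmer(y ~ time + (time | id), data = df)
bayes_slopes <- coef(fit)$id[, "time"]

# 2) Ordinary least squares: a separate regression per person.
#    No shrinkage, so a heavily skewed distribution of slopes survives.
ols_slopes <- sapply(split(df, df$id),
                     function(d) coef(lm(y ~ time, data = d))["time"])

# Compare the two distributions; strong positive skew in the OLS
# slopes is a warning sign for the normality assumption behind
# the Bayes estimates.
hist(ols_slopes)
hist(bayes_slopes)
```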

2.     The three legs of the stool

Psychotherapy: One of the other student psychotherapists at my rotation taught me about the three legs of the psychotherapy stool. The idea is that no stool can stand on 1 or 2 legs; it requires 3 for a solid foundation. The three legs of psychotherapy practice are 1) scientific research, 2) clinical experience, and 3) client preferences. When all three are used, psychotherapy is optimal. Some more intuitively minded psychotherapists have a tendency to disregard scientific research. Some more scientifically minded psychotherapists have a tendency to disregard clinical experience. Some psychotherapists who ascribe rigidly to the medical model disregard client preferences. Each of these practitioners is only including 2 of the stool’s legs. I will focus on the scientifically minded psychotherapist because that in particular applied to me. I tended to disregard clinical experience, or clinical wisdom. The problem is that scientific research is a) nomothetic and b) incomplete – it has not studied all relevant variables. Scientific research indicates what psychotherapy works best on average. But when I see a client, I don’t have an “average” person; I have one particular person. Psychotherapy research has not evaluated the impact of the plethora of individual difference variables. Therefore, I need to use empirical evidence from my own clinical experience and the clinical experience of my colleagues (or the clinical experience of an author from a book). While not as systematic as empirical evidence from a scientific study, it is better than assuming your client is some hypothetical average person.

Data Analysis: I came up with the three legs of the data analysis stool: 1) statistical accuracy, 2) ease of interpretation, and 3) data analytic burden. The first has to do with the mathematical research done by statisticians. The statistical researcher seeks to outline best practice for data analysts by determining which statistics to use and how to use them. As point 1 above (research versus practice) delineated, local statistical accuracy includes taking into account the idiosyncrasies of the specific dataset in front of you. A data analyst cannot blindly apply statistical research to his work without critically thinking about how the nomothetic best practice might need to be modified for his idiosyncratic data. That is not the only thing to consider as a data analyst, though. The ease of interpretation of statistical results is important to consider as well. You probably remember the question: “If a tree falls in a forest and there is no one there to hear it, does it make a sound?” My own variant of the question is, “If a data analyst presents results and no one can understand them, did he do his job?” Data analysis for social science will be read not only by other scientists, but also by lay people seeking to better their lives and communities. I advocate for publishing results that are interpretable by your non-scientist friends and family. Personally, I ask myself if my late grandma – who was only educated up to 8th grade – would be able to interpret my results. If not, I seek to simplify, potentially at the cost of some statistical accuracy. After all, statistical accuracy is next to useless if no one can interpret that accuracy. The third leg is data analytic burden. We are all constrained by time. If we are academic researchers, we have classes to teach, exams to grade, faculty meetings to go to, grants to write, etc. If we work for a company, we have other clients to analyze data for, company emails to respond to, employees to supervise, etc. Data analysts do not have time to learn the state-of-the-art best practice for every type of analysis. Data analysts need to consider how much time and energy doing best practice will take and decide if it’s worth it. If a much more complicated analysis will only lead to slightly more statistical accuracy or slightly easier interpretation, it might not be worth it.

Example: In my own work, I was doing mediation with multiply imputed data. I knew from the statistical research that the bootstrapped confidence interval is best practice for estimating the uncertainty around an indirect effect because its sampling distribution is mathematically non-normal. I knew how to do bootstrapping with complete data, but not with multiply imputed data. So instead I computed the Sobel test to estimate my uncertainty. I knew the Sobel test was less statistically accurate because it assumes a normal sampling distribution for the indirect effect, but it saved me hours of work. I could have read new quantitative psychology articles about how to estimate bootstrapped confidence intervals with multiply imputed data and searched for an R package to do the analysis (or, worse, written my own R code). But I was up against a deadline, had other work to do, and wanted to sleep at least 7 hours that week. And that is okay. It is unreasonable to expect everyone to learn the most statistically accurate way to do every analysis. Furthermore, I converted the indirect effect from unstandardized units of summed Likert items to percent mediation. Percent mediation is mathematically problematic as an effect size because it can be negative; however, the ease of interpretation is fantastic. My grandma could understand percent mediation, but she could not understand X units of change in summed Likert scale items for every one unit change in a different set of summed Likert scale items. Heck, I can’t understand that!
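For illustration, here is a minimal Sobel test sketch in R. The path estimates and standard errors are made up; in practice they would come from your fitted models, pooled across the imputed datasets.

```r
# A minimal Sobel test sketch for an indirect effect a*b.
# All values below are made up for illustration.
a    <- 0.40   # X -> M path
se_a <- 0.10
b    <- 0.30   # M -> Y path (controlling for X)
se_b <- 0.08

ab    <- a * b                              # indirect effect
se_ab <- sqrt(b^2 * se_a^2 + a^2 * se_b^2)  # Sobel standard error
z     <- ab / se_ab
p     <- 2 * pnorm(-abs(z))  # assumes a normal sampling distribution

c(indirect = ab, se = se_ab, z = z, p = p)
```

The normality assumption in the last step is exactly the statistical inaccuracy being traded away for a few saved hours.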

3.     You learn from the synergy of theoretical reading and practical experience

Psychotherapy: I have always prided myself on doing my homework with clients. I have read additional therapy books to understand diverse clinical orientations and complex presentations. And my psychotherapy supervisor knew this. However, I will never forget when I brought up a new book I was reading on psychodynamic therapy during one of our meetings: he didn’t seem impressed. He said I would learn more from seeing many different clients for whom cognitive behavior therapy didn’t work than from reading a textbook on psychodynamic therapy. I couldn’t learn how to be a great psychotherapist just by reading textbooks on clinical theory and techniques. Practical experience was necessary. Reading and learning from colleagues (we can think of reading as a form of learning from a colleague – one you don’t know) was necessary, but not sufficient. You will not be an expert at treating a particular psychological disorder after seeing just one client, no matter how many textbooks you read about the disorder. Clients are heterogeneous, and you need to learn how to flexibly adapt and translate what you read into practice.

Data analysis: You can’t learn data analysis just from a textbook. Learning the theory behind various statistical analyses and how to use the software programs is necessary, but not sufficient. You need to analyze datasets – real datasets with non-normality, small samples, missing data, poor measures, etc. A statistical researcher who has never analyzed real data is going to struggle. They might fully understand the math behind the analyses and even know how to code the software functions from scratch, but that won’t teach them how to deal with the pragmatics of real data.

Example: Before doing my first multiple imputation model for missing data, I read two books on missing data and multiple imputation. I thought I knew multiple imputation and could do the analysis on any dataset. I understood the statistical theory and the software syntax. Then I went to use multiple imputation for the first time on a longitudinal dataset with 0% missing data at time 1 and 50% missing data at time 4. I ended up with 95% missing information for all of my effect sizes, which is ridiculously high. I knew something was wrong, but I didn’t know what. I realized I clearly did not know multiple imputation well enough – at least not in the context of real data. The issue related to imputing individual growth curves as opposed to static total scores. The two textbooks I read didn’t have a chapter on imputing individual growth curves, so I didn’t know what to do in that specific context. Just like a psychotherapy book won’t have a chapter on obsessions related to the fear of thinking about the image of a lampshade all day. These two situations are too nuanced. So I bought a third book that went into greater detail on the statistical theory behind multiple imputation. By iterating between the data and the textbook, I solved the issue and ended up with 60% missing information, which was much closer to what I expected. The textbook could not tell me how to analyze my particular dataset, and the textbook alone did not motivate me to delve into the details of the statistical theory. For my data analytic problem, the combination of real data and reading was needed.
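For the curious, here is a sketch of the kind of sanity check involved, using the mice package and assuming a data frame `df` with an outcome `y` and predictor `x` (names hypothetical):

```r
# A sketch of checking the fraction of missing information after
# multiple imputation, assuming a data frame `df` with the analysis
# variables (variable names hypothetical).
library(mice)

imp    <- mice(df, m = 20, seed = 123)  # impute 20 datasets
fits   <- with(imp, lm(y ~ x))          # fit the model in each one
pooled <- pool(fits)                    # pool with Rubin's rules

# The fmi column is the fraction of missing information per estimate.
# Values far above the actual missing-data rate (e.g., 95% when only
# 50% of cases are missing) signal that something is off.
summary(pooled)
pooled$pooled$fmi
```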

4.     The conceptualization is foundational to everything you do

Psychotherapy: One thing I gained a much greater appreciation for in my most recent year of psychotherapy training is that the conceptualization of your client is the foundation for everything you do. You cannot adequately treat someone until you have fully assessed and understood them. Previously, I was always confused by the term “conceptualization” and didn’t understand what exactly it meant. When I asked previous supervisors about it, I seemed to leave meetings more confused. My supervisor this year gave me a great, simplified definition: why does it make sense that your client has the symptoms they have – or, more generally, the behaviors, cognitions, and emotions they have? It is answering the “why” behind the local clinical assessment data you receive. Of course, in order to develop a conceptualization, you need to first conduct a very thorough assessment – another aspect of psychotherapy I came to appreciate. I used to view in-depth assessment as a waste of time and would rush into interventions. I now know that great psychotherapy follows great clinical assessment.

Data analysis: The conceptualization of your data is why it looks the way it does. The first step is determining what your data look like – the assessment. Many times people (including myself) fail to “get to know” their data. They jump straight into analyses. While that can work if the dataset is really simple and clean, data rarely are simple and clean. (Same with psychotherapy. If your client is really simple and non-complex, an in-depth assessment might not be needed. But since when are people non-complex?) First, you want to do exploratory data analysis to assess and conceptualize your data. Is that total score positively skewed? Why might that be? Do the items you believe measure the same construct hang together? If not, why not? How much missing data is there? Why are they missing? Skipping exploratory data analysis is either going to lead to hitting a roadblock during the analysis (e.g., a software error message you don’t understand) or, worse, incorrect results. I personally don’t trust a previous data analyst’s total scores (similar to the idea that you should not trust another psychotherapist’s diagnosis, but always do your own independent assessment). I always (re)create my own total scores because I want to be sure they contain the items I think they do and allow only as many missing items as I am comfortable with.
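A short exploratory pass along these lines might look like the following sketch, assuming a data frame `df` with items `item1` through `item5` (names and scale hypothetical):

```r
# A short exploratory pass, assuming a data frame `df` with items
# item1-item5 on a known response scale (names hypothetical).
items <- df[, paste0("item", 1:5)]

# Do the items hang together?
cor(items, use = "pairwise.complete.obs")

# How much is missing, and where?
colMeans(is.na(items))

# (Re)create the total score yourself, tolerating at most one
# missing item per person (prorated).
n_miss <- rowSums(is.na(items))
total  <- ifelse(n_miss <= 1, rowMeans(items, na.rm = TRUE) * 5, NA)

# Check the distribution and scan for out-of-range values/outliers.
hist(total)
range(items, na.rm = TRUE)  # should fall within the response scale
boxplot(total)
```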

Example: I recently did some data analysis where I failed to check my total scores for outliers. I ran the results, wrote up the paper, and submitted it to an academic journal for publication. It was rejected, and we were making revisions for submission to another journal. I then realized that I had failed to identify an extreme outlier due to a human data entry error that was severely skewing the results. I had previously thought: I don’t have time to thoroughly assess my data and get a good conceptualization of which data points make sense and which don’t; I will save time, assume this dataset is straightforward, and move straight to the analysis. I have now learned that the front-end assessment is almost always worth it. You have to take the time to get to know your data through data cleaning and data exploration to inform your data analysis selection and implementation.

5.     There are multiple right ways to go

Psychotherapy: Coming into my most recent year of clinical training, I was often stressed by the task of finding the single, correct way to proceed with a client during a session. I debated different Socratic questions in my mind, or debated going behavioral vs. cognitive, or problem-solving vs. acceptance. I would then come to supervision meetings, tell my supervisor what decisions I had made during the session, and ask if they were the “correct” decisions. Instead of giving a direct answer, he often said, “I think what you did was justifiable. I also think the other option you were considering was justifiable.” I would ask, but which is the “correct” one? I came to learn that there is no one right way to go with a client. Instead, there are multiple right ways to go, and preferences can dictate the final decision. I could start with either cognitive therapy or behavior therapy with a client – either was justifiable. Now, that does not mean all options are right. Multiple right answers still allows for a plethora of wrong answers. But I had enough training to, in general, steer clear of the wrong answers.

Data analysis: There is rarely one way to analyze data. There are usually multiple justifiable ways to analyze your data. With advanced statistics, there are more and more detailed decisions to make. Do we assume the variances are equal in a t-test? Do we use a linear regression or a gamma regression for a positively skewed outcome? Do I use regular or sandwich standard errors? Do I allow covariances within my latent profiles or fix them to zero? Which decision to make will depend not only on the idiosyncrasies of the data, but also on the data analyst’s preferences and training. And that is okay. Trying to chase down the “correct” analysis to run is a futile endeavor. Instead of asking myself if my analysis is correct, I now ask myself if my analysis is justifiable. Of course, there are still many wrong ways to analyze data. Another benefit of this point is enhanced collaboration. If you believe your way of analyzing data is the single best way, you will struggle to collaborate with other data analysts. But if you view core issues as facts and peripheral issues as preferences, you will be able to work with many different collaborators. I have heard some data analysts say you should never use frequentist statistics, only Bayesian statistics. If you hold that (prior) belief, you will find yourself struggling to work with many social scientists, as frequentist statistics are much more commonplace. Acknowledging multiple right answers is important not only for your own data analysis, but also for collaborative data analysis.
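To make a couple of those decision points concrete, here is a sketch in R, assuming a data frame `df` with outcome `y`, predictor `x`, and grouping variable `group` (names hypothetical). Each pair is a set of justifiable options, not a right answer and a wrong one.

```r
# Two decision points with multiple justifiable answers, assuming a
# data frame `df` with columns y, x, and group (names hypothetical).

# Equal variances or not? Both are defensible calls.
t.test(y ~ group, data = df, var.equal = TRUE)   # classic Student t-test
t.test(y ~ group, data = df, var.equal = FALSE)  # Welch correction (R's default)

# Linear or gamma regression for a positively skewed (positive) outcome?
fit_linear <- lm(y ~ x, data = df)
fit_gamma  <- glm(y ~ x, data = df, family = Gamma(link = "log"))
```

Which member of each pair to report depends on the data's idiosyncrasies and the analyst's training, which is exactly the point.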

Example: I was conducting a confirmatory factor analysis on a set of 7-point Likert scale items. The statistical research suggests that best practice with Likert items is using polychoric correlations with a weighted least squares estimator. I did that analysis and got the results. When I was telling one of the psychology professors about the analysis, he asked why I didn’t just use Pearson correlations with the maximum likelihood estimator. I explained to him that best practice was this other approach and referenced a couple of papers from statistics journals. He said he finds that with 5 or more response options for Likert items, Pearson correlations and maximum likelihood estimation do just fine. So I went back to my data and tested it out for myself. It turns out I got almost exactly the same results. The results with the polychoric correlations had very slightly larger factor loadings, but not enough to matter. It was clear that, in this case, either type of confirmatory factor analysis was a justifiable option.
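For reference, the two variants from this example might look like the following in the lavaan R package, assuming a data frame `df` with items `item1` through `item5` and a one-factor model (names and model hypothetical):

```r
# The two CFA variants from this example, assuming a data frame `df`
# with items item1-item5 (names and one-factor model hypothetical).
library(lavaan)

model <- 'factor =~ item1 + item2 + item3 + item4 + item5'

# Polychoric correlations with a (diagonally) weighted least squares
# estimator: items treated as ordered categories.
fit_cat  <- cfa(model, data = df, ordered = paste0("item", 1:5),
                estimator = "WLSMV")

# Pearson correlations with maximum likelihood: items treated
# as continuous.
fit_cont <- cfa(model, data = df, estimator = "ML")

# Compare the standardized loadings from the two approaches.
standardizedSolution(fit_cat)
standardizedSolution(fit_cont)
```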

6.     Principles of practice are more important than protocols of practice

Psychotherapy: I learned psychotherapy from protocols, sometimes called treatment manuals. Whenever I got a new client and had identified their diagnosis, I would search our training clinic’s library for a protocol for that diagnosis and age group (e.g., adolescent vs. adult). I would then start with the protocol’s session 1 agenda and worksheets. This worked as a starting place for learning, but not as a system for sustained psychotherapy practice. As a psychotherapist, you cannot control the client – they have their own agency. Oftentimes, they will not let you walk them through the protocol session by session. Other times, the protocol does not fit because they don’t have that exact diagnosis or are in a unique life circumstance. Still other times, the protocol fits, but isn’t working. The protocol is not necessarily written for flexibility, and flexibility is almost always required when delivering psychotherapy. The question then becomes: which components of the protocol are core and which are peripheral? Which components are targeting the mechanism of change and which are less important? This is where principles come into play. The principles behind psychotherapy guide the psychotherapist on how to adapt the protocol. If you understand the principles behind the protocol, you can create a unique protocol for that specific client – one you have never used before. But because you know the principles, the mechanisms of change will still be part of the psychotherapy.

Data analysis: Some statistical textbooks provide a formula for data analysts to follow when running certain analyses (e.g., a multi-level model). Protocols are limited, though, because they assume an idealized setting. Data are messy, and deviations from protocols are often required. You have to understand the principles behind the protocols to know HOW to correctly adapt them. Without the principles, you are running blind when trying to determine which aspects of the protocol are core and which are peripheral. One big limitation of a protocol of practice is that you don’t know what to do when you hit roadblocks. The protocol doesn’t tell you what to do if X, Y, or Z happen; it tells you what to do if everything goes smoothly. The principles of practice guide you in troubleshooting and getting around roadblocks in your data analysis. With the principles behind the protocol, you can improvise a unique variant of the textbook formula for that particular dataset.

Example: I was testing measurement invariance in a structural equation model and was using an estimator that scale-shifted the chi-square value to increase statistical accuracy. I was then computing conventional model fit indices (e.g., RMSEA) from the scale-shifted chi-square value and degrees of freedom. Because my sample size was so large, using the chi-square difference test to compare measurement invariance models was not very helpful. Instead, I was comparing the various model fit indices across models. The textbook I had on structural equation modeling explained that multi-group measurement models that force more parameter estimates to be equal across groups will have worse model fit. Everything was going fine until I hit a roadblock. More parsimonious models – models that forced more parameter estimates to be equal across groups – were fitting the data better, even on model fit indices that don’t account for model parsimony (e.g., via degrees of freedom). I was analyzing the same data, and when I added model constraints, my model fit improved. The textbook essentially said this was impossible, and because all I had done was read the textbook, it made no sense to me. I had hit a roadblock and didn’t know what to do next because I didn’t understand the principles behind the scale-shifted chi-square value. I didn’t really understand scale-shifting at the time; I simply knew I had to do it in that particular case. So I went and read more about scale-shifted chi-square values. What I learned was that after scale-shifting, the chi-square values are no longer comparable across models, and because the model fit indices are functions of the scale-shifted chi-square values, they are not comparable either. The only valid comparison was a chi-square difference test based on the original chi-square values from before the scale-shifting. The original chi-square values were used to determine the parameter estimates; the scale-shifting was for model fit and standard errors. Without understanding those principles behind chi-square values in structural equation models, I was at a loss – with or without the protocol.
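For anyone hitting the same roadblock in the lavaan R package, here is a sketch of a comparison that respects this principle. The model string, data frame `df`, and grouping variable `group` are hypothetical placeholders.

```r
# A sketch of comparing nested invariance models fit with a
# scaled/shifted test statistic, assuming a data frame `df` with
# items item1-item5 and a grouping variable (names hypothetical).
library(lavaan)

model <- 'factor =~ item1 + item2 + item3 + item4 + item5'

fit_configural <- cfa(model, data = df, group = "group",
                      estimator = "WLSMV")
fit_metric     <- cfa(model, data = df, group = "group",
                      estimator = "WLSMV", group.equal = "loadings")

# Do NOT subtract the scale-shifted chi-square values (or fit indices
# built from them) by hand; they are not comparable across models.
# lavTestLRT() computes a proper scaled difference test instead.
lavTestLRT(fit_configural, fit_metric)
```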