A little late to the party, but things have been busy so the blog post slipped my mind (oops). But here I am, graduated! It’s an incredible feeling, and I’m ready to take on the world again in the fall when I start a master’s program in counseling psychology. Last week was also my last real day in the lab, and yesterday I had a chance to say goodbye to everyone.

Over the course of last week, I worked to wrap everything up by making sure all of the materials from the last few sessions were updated in the appropriate places. This included inputting the data from the post-session surveys, inputting the scores for the algebra pre- and post-tests, uploading and slicing the remaining videos, and getting the first real batch ready for Mechanical Turk. I rated our last test batch in-house to help us determine if we could reasonably proceed, and we ended with a Krippendorff’s alpha (K’s alpha) of 0.63, which I believe is pretty similar to previous IRR values.

On special request, I also ran one last demonstration of the WoZ setup, this time for a group of high school students. The event featured different stations with student projects, and I set up our agent to help represent our lab. Overall it went well, considering that each cluster of students spent about 2 minutes at a given station, so the explanation had to stay fairly short and sweet. It was also pretty loud because of the music, so it was difficult to hear Jaden when I demonstrated the different types of utterances Jaden has.

Overall, I had a great last week and I feel good knowing that everything ended on such a good note. Ever since I began as a research assistant (back in 2015, although I was off-and-on and ended up skipping one semester), I’ve been involved with different stages of the project. I worked my way up from just annotating for 10 hours a week to actively recruiting for the WoZ study and managing data. I’ve seen a lot of growth in myself with the help of everyone in the lab, especially my mentor, who really pushed my independence in the final semester. I’m sad to leave, but I’m hopeful that whoever falls into this project can bring it to great places.

And that’s a wrap.



While most of my week was spent at AERA, it was a long one between traveling and returning to classes. Monday of AERA was definitely more exciting than Tuesday, which had more poster sessions than paper sessions. I ended up at 3 different poster sessions; I was planning on ending my day at an education and technology event, but it ended up being a closed meeting. I chose the poster sessions because I wanted to experience how broad the field of education could be, but I didn’t get as much out of the posters as I did from the speakers. I seem to learn more when something is explained verbally, and the paper sessions also made it easier to connect research on the same topic (it also helped that the paper sessions were organized by topic and contained about four speakers, while poster sessions were general and each had over 20 posters). While the poster presenters did have a spiel they provided to observers and answered questions, I definitely learned more in the paper sessions.

Going back to the ArticuLab, part of my wrapping-up experience is working with others who are interested in working with the RAPT data we have. Last week, I reviewed the materials we have available in the 2016 dataset (things like the audio and video, materials for conducting the studies, annotations, etc.) with someone who is interested in exploring nonverbal behaviors. We also did a brief run-through of the WoZ setup, and how the tutoring sessions work. The agent we use for RAPT doesn’t have much movement, so that’s one area that he is interested in looking into. I also met with a grad student who’s interested in the annotation process we’ve used and wants to work with social annotations. We went over the process for annotation training and reaching reliability, the annotation manuals we have on file, and my personal experience with annotating. We discussed the difficulty we had last year in annotating for self-disclosure, so I was excited to hear that he’s planning on elaborating on types of self-disclosure and other social behaviors (intrinsic and extrinsic) in the annotation process. He also wants to attempt to facilitate annotating through a platform such as MTurk, and I’m interested in seeing how that process could work out. In my experience, annotation (training, at least) required a lot of in-person reviewing of discrepancies and explanation for annotations. The process would be much faster over MTurk, though, so I’m hoping to keep in contact with him even after I leave the lab.

This is a week of wrapping up for me, as I’ve wrapped up my undergraduate career (last assignment submitted today, graduating in 6 days!) and I’m wrapping up at CMU. Before I go, I believe I’ll just be having one more meeting with my mentor to discuss what we’ve accomplished thus far, the next steps, and so on and so forth.


Hello from New York! This post is a little later than usual because I’ve spent the whole day at symposiums at the AERA conference! Just to briefly overview my trip, I wasn’t able to make it to the beginning of the conference (Thursday) because I was presenting at a psychology conference in Erie, PA Friday & Saturday, but I flew out to NY Sunday morning (in this case, morning=3am) to land at about 11. There were some sessions I was interested in from 3-5, but my hotel room wasn’t ready until 3 so I had to skip out on those (I didn’t think to dress anything but super casual for my flight, and I was trying to avoid being that one person you always see (or your mom warns you about) that shows up underdressed and looks ridiculous). I also made a good call on skipping the 8:15s on Monday (which was partially from exhaustion from all of my traveling), as Manhattan happened to issue a flash flood warning at about 7:30. Another good call on booking a hotel within walking distance to the conference hotels because the rain was pouring for several hours; I also heard the rain was so bad it was leaking into the subway stations AND the cars! That makes walking through ankle-deep puddles in flats seem like a kinder fate (almost).

My day was super packed with symposiums, even by skipping the first ones of the day. My day began with New Directions in Motivation Regulation, which was followed up by Learning and Development Through Pedagogical Designs, The Impact of Instructional Technology on Faculty, Curriculum, and Teaching, and finished with The Impact of Teachers’ Psychological Characteristics on Young Student Experiences and Outcomes. The sessions were almost back-to-back (with about a 15 minute gap), so I got some nice brisk walking time (still in the rain) going from hotel to hotel.

For the most part, the presenters/researchers were dynamic and excited to talk about their work, so it wasn’t too difficult to keep up in a room of education professionals. But then a presenter walks up with a complex model on their slide, says “we’re all familiar with self-regulation models, right?” and the entire room nods in approval, so you spend the next 30 minutes trying to keep up. When I made my planned activities list, I was overwhelmed with all the possibilities and just chose whatever I saw first. When I got here (and had all that spare time) I began looking more closely at the sessions available and strategically planned where I had to be and when, so my schedule completely differed from the one I originally anticipated. If I hadn’t changed it, I would have taken time to read some of the abstracts that were available. Overall, the sessions were very informational and covered topics such as why we’re motivated, how we can utilize different types of motivation (which I didn’t even realize existed!) to complete various tasks more easily, how humor can benefit learning (aka: how memes can help you learn stats), how online classes can both help and hurt education, and how teachers’ moods can affect the way their students learn.

I was pretty interested in the technology seminar, seeing as my research at CMU is developing a virtual tutoring system for classroom accessibility. This ended up being about online classes and tools for classroom use, and even then it was a little different than what I had expected. Most of the presenters were educators (as were the attendees), so they had first-hand experience with technology. But the presenters discussed how technology in education can foster distraction from work, how students expect their professors to always respond quickly, and how students don’t like online work because they don’t like digital text and googling answers. I was taken aback by all of this, but the audience generally seemed to fully support these notions. Yes, I can be guilty of occasional distraction and emailing my professors at 1am about a test or assignment, but they seemed to address the useful nature of the systems only to brush it off in favor of face-to-face. I’ve always admired learning technology for its ability to make material more accessible (as a student who works full-time, online classes are amazing because I have more flexibility in scheduling, and online textbooks are cheaper for renting and accessing for spur-of-the-moment studying), and I think this symposium could have benefitted from including learning technologies in tutoring (hey! that’s what I do!).

I was also interested in the teacher psychological characteristics session as a psychology major, so it was very interesting to see these researchers discuss how the mental state of educators can impact their students’ outcomes. For example, teachers’ depressive symptoms and work-related stress influence their involvement in learning and their relationships with their students, both of which affect students’ ability to learn. Teachers with fewer depressive symptoms can also assist in their students’ emotional and behavioral regulation, which can then keep students on-task and learning more.

Tomorrow isn’t as busy of a day because I’ll have to get to the airport and fly out in the evening, but I’m going to look through the program and find some exciting sessions to attend and report on them next week. When I get back, I’m in my last week of classes! From there, I’ll be moving home and graduating before I start grad school at the end of August 🙂


This week started off with some data upkeep for more recent participants: uploading the videos, trimming them, and re-uploading them for transcription. The only thing I haven’t gotten to is slicing (but that wasn’t as high on the radar as transcription was). I also haven’t gotten around to reviewing the pre- and post-tests and recording scores, but since there were only about 4 new participants it’s not too bad. Most of this week’s excitement comes from the fact that I finally had an opportunity to WoZ completely by myself!

With the last participant I scheduled, I had some technical issues with setup of the desktop. Since most of the setup is actually on the desktop, I was rushing to set everything up and worried I wasn’t going to finish on time (this is the session that didn’t end up happening because they forgot; I think I’ve referred to this as “a test run with more panicking”). We usually set up for participants an hour prior to the start time, but to be safe I wanted to start setting up an hour and a half in advance. This ended up being an excellent idea; since I had significantly more time, I felt more comfortable and had plenty of time to set up, double-check everything, and quickly review the information on the consent form one more time. I finished setup about 20 minutes prior to the session start time, so coming early was definitely a good call. And the participant was there!

I thought the session ran pretty smoothly for the most part. One aspect that I was not prepared for was the time between choosing an utterance to send and the utterance being spoken. The last few participants we’ve had weren’t as talkative, so we’ve been able to suggest trying the next problem, open the problem, and ask what should be done first. This participant was more talkative, though, so when the problem would come up they would explain what they thought they should do. Which is great! But from time to time there was enough silence that I thought prompting the first stage of the problem would be good, and then they began talking and the agent interrupted. I was also anticipating some difficulty in choosing the correct utterance when they were struggling with the next step of the problem (which is a lot more difficult than it seems).

After a successful WoZ run, I’m going to finish up sorting out the rest of the materials from the last few participants. I don’t think I’ll get to any participants for the week because it’s AERA weekend! So this week I’m going to be finishing up sorting our data before the conference, and when I get back I’ll be wrapping up my work in the lab 😭


Another week down, 3 (😰) more to go! This is also my second to last week before AERA! As I’m gearing up for the end of my semester and graduation, my mentor and I are tying up the loose ends of the project and planning what we’ll end with. In the upcoming days, I’m going to start by working with the data from the most recent participants we’ve had. This includes trimming the excess stuff from the videos, uploading them for transcription, reviewing pre and post-tests, slicing the videos, and uploading slices to prepare for future MTurk submission.

Speaking of MTurk, we also looked over the sum of our three test batches to look at K’s alpha, Cronbach’s alpha, and ICC between all of the raters, the best 3 raters, groups for all raters with alpha 0.5 or greater, and groups for the best 3 raters with alpha 0.5 or greater. I recall mentioning this last week, but this value comparison also includes the changes we’ve implemented through each new batch to document the steps we took to try to improve reliability. Our numbers seemed to be matching up consistently with results from the human-human dataset, so we’re ready to pick up on rapport rating by uploading larger batches for rating.

Recruitment hasn’t been too successful, since two of our scheduled participants either did not confirm their time or had to reschedule, and I haven’t received any responses in my efforts to reschedule with them. I’ve had one new participant interested through recruitment, so I’m hoping that I can schedule them in before the conference. With my schedule, this gives me about two weekends where I can do a WoZ run completely by myself.


This week, my goal was to run another participant and get set up for our third test batch on MTurk. Overall I was partially successful. I got the next 100 slices uploaded and posted a third test batch for Mechanical Turk on Wednesday with the plan to parse the results over the weekend and check reliability. The results weren’t necessarily bad; our K’s alpha for all 4 raters has remained pretty consistent around 0.5. Looking at the best 3 raters, this number is closer to 0.6. As we keep moving along with our batches, my mentor helped me to organize a spreadsheet with each of our test batch numbers, the number of slices within the batch, the K’s alpha, ICC, and Cronbach’s alpha for all raters, the K’s and ICC of the best 3 raters, the K’s and ICC of the best 3 excluding values less than 0.5, and modifications we made for the submission. This helps us to look at all of the values and to compare reliability alongside the changes we made. From the second test, for example, we had added a second example of neutral rapport in the training portion of the HIT, and K’s alpha for the best 3 remained consistent. Our current values seem to be matching up with batches from earlier in the study for human-human data collection, so we’re going to be prepping a bigger batch to have all of our slices rated. Hopefully soon we can move into data analysis with what we have.
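For anyone curious what one of these reliability numbers looks like in code, here is a minimal sketch of Cronbach’s alpha with each rater treated as an “item.” It is not our actual analysis script: the ratings are made up, the function names are hypothetical, and “best 3 raters” is assumed here to mean the three-rater subset with the highest alpha. K’s alpha and ICC need more machinery and are left out.

```python
from itertools import combinations
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows (one row per slice, one score per rater)."""
    k = len(ratings[0])  # number of raters
    # Sample variance of each rater's scores across slices
    rater_vars = [variance([row[j] for row in ratings]) for j in range(k)]
    # Sample variance of the per-slice totals
    total_var = variance([sum(row) for row in ratings])
    return (k / (k - 1)) * (1 - sum(rater_vars) / total_var)


def best_subset(ratings, size=3):
    """Indices of the rater subset with the highest alpha (assumed meaning of 'best 3')."""
    k = len(ratings[0])

    def alpha_for(cols):
        return cronbach_alpha([[row[j] for j in cols] for row in ratings])

    return max(combinations(range(k), size), key=alpha_for)


# Made-up ratings: 5 slices, 4 raters; the last rater disagrees with the rest
example = [
    [2, 2, 3, 6],
    [4, 4, 4, 1],
    [6, 5, 6, 4],
    [1, 2, 1, 5],
    [7, 6, 7, 2],
]
print(round(cronbach_alpha(example), 2))  # alpha across all 4 raters
print(best_subset(example))               # → (0, 1, 2)
```

Dropping the one out-of-step rater raises alpha a lot in this toy data, which is the same pattern we see when comparing all raters to the best 3.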

This week I was also getting ready to run my first solo participant without my mentor to guide me. Setup was a little frustrating due to some issues with getting the desktop set up, but I ended up getting all of the programs up and running and tested by the time the participant was scheduled for. They ended up forgetting that they scheduled, so this was more like a test run of setup that was more stressful. They are going to reschedule though, so I’m happy that we’re at least retaining the participant. Since my time is coming to an end, I’ve also been making more recruitment efforts to squeeze in some final participants.

So this week had ups and downs, but I’m excited for the progress we’re making. I’m going to get the next batches uploaded and ready to go so that we can submit the assessments as soon as we get approval from the PI. I’ll also be waiting for some responses to recruitment and hope I can get some scheduling completed.



This week wasn’t necessarily full of excitement, but I made some progress in what I’ve been working on for the past few weeks. In a mini-recruitment, we ended up getting two more interested participants so I have them scheduled for this week. Now that the rest of the semester is looming over me (I’ll be graduating in almost a month), we want to end with at least 36 participants to have 12 in each session condition (task-only, random social, adaptive social) so I’m going to try to push harder on recruitment.

I’ve also been getting ready for the next test batch on Mechanical Turk by looking at the second test run and the groups of raters with the lowest reliability. I may have previously mentioned that this batch showed that there was higher reliability in groups that shifted away from rating neutral (4) and instead rated high or low, and lower reliability groups had difficulty deciding on neutral rapport. So I took a look at the lowest reliability group to determine if there were any particular slices that were causing trouble, to see if I could improve the third batch. There was one slice that was rated as neutral by 2 raters, and low and high by the other two. When I looked at the video, I noticed that although there is a lot of silence, the student responds to the agent’s question and the agent responds to that after a pause. In the prompts for high, neutral, and low rapport, raters also seemed to consider silence as a major factor in rapport. Too much silence=low rapport. Neutral rapport was this weird middle ground, as high rapport was no silence and back-and-forth reciprocation. So this week I’m going to take a look at our current neutral examples in the assignment to determine if one of the examples is swaying ratings in the wrong way (the example is more high/low rapport but we have it as neutral) and switch it to this problem slice. Hopefully this will lead to better reliability when I post the next batch.
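Hunting for trouble slices by hand works for a small batch, but the same check can be sketched in a few lines. This is only an illustration: the slice IDs and scores are invented, the 1-7 scale is an assumption (only the neutral rating of 4 comes from our HIT), and `flag_disagreements` and its threshold are hypothetical.

```python
def flag_disagreements(slice_ratings, min_spread=3):
    """Return (slice_id, spread) pairs where raters disagree by at least min_spread points.

    slice_ratings maps a slice ID to the list of scores its raters gave.
    Results are sorted with the largest disagreement first.
    """
    flagged = []
    for slice_id, scores in slice_ratings.items():
        spread = max(scores) - min(scores)  # gap between highest and lowest rating
        if spread >= min_spread:
            flagged.append((slice_id, spread))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Invented example: slice_b is rated neutral twice, low once, and high once
ratings_by_slice = {
    "slice_a": [5, 5, 6, 5],
    "slice_b": [4, 4, 2, 6],
    "slice_c": [2, 3, 2, 2],
}
print(flag_disagreements(ratings_by_slice))  # → [('slice_b', 4)]
```

A simple max-minus-min spread is a blunt tool, but it is enough to surface slices like the one above that are worth rewatching.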

Finally, I’ve spent some time this week working out budgeting with my mentor for the conference I’ll be attending in April. He gave me the suggested numbers for the budget items like transportation and lodging while I worked out the planned activities report of the sessions I wanted to attend. There were a lot. The conference spans several hotels, several floors, several rooms, several hours. It was overwhelming, to say the least. After some planning, here’s what I came up with:

Sunday 04/15

Traveling/registration (time TBD)

10:35-12:05 Division E Fireside Chat. Adolescent Mental Health in Schools and the Community: Supporting Students and Families in an Uncertain Political Landscape

2:45-4:15 Learning in Diverse K-12 Settings: Exploring the Hidden Curriculum and Critical Disconnects

6:30-8:00 Design and Technology SIG Business Meeting

Monday 04/16

8:15-9:45 Using Models, Exploring Factors, Gathering Lessons Learned, and Analyzing Patterns to Inform Online Teaching and Learning

10:35-12:05 Exploring Academic Success

12:25-1:55 Learner, Instructor, and Designer Roles

2:15-3:45 The Impact of Instructional Technology on Faculty, Curriculum, and Teaching

Tuesday 04/17

8:15-9:45 Poster Session

10:35-12:05 Learning and Development Through Engagement

12:25-1:55 Learning and Development Through Community Engagement and Group Learning

2:15-3:45 Scaffolding Self-Regulation, Co-Regulation, and Socially Shared Regulation of Future Learning: Affordances of Learning Analytics Dashboards

Traveling (time TBD)


I’m excited!


Since I’ve spent the past week on a research-free spring break, I’ll do a recap of the week before. Last week was pretty exciting! We got some responses to recruitment for new participants, so we scheduled participants for the weekend. We had one participant on Saturday and two on Sunday so I had two watching opportunities and one run where I ran the system myself 🙂

In our test runs, we were having some issues with the agent’s voice being incorrect, so we were very relieved that for our first participant we were able to get the correct voice working. Our first participant was great. They responded to everything the agent said and were able to verbalize their thought process for problems they were solving incorrectly. The only difficulty we had was in one problem where they had the right step, but they added incorrectly so their answer was slightly off.

The other two participants were a little awkward to work with. Observing the first one was especially painful because they pretty much didn’t respond at all. They also had a little prior algebra experience so they were flying through the problems before the agent could even ask how they thought they should start the problem. I ran the system for the second participant and it actually went a lot better than I expected. In practice runs I was struggling to find the correct response, but in actuality it wasn’t as difficult. I also had my mentor sitting next to me to make suggestions when I was too stuck and didn’t know what to say.

Overall, I’m really happy with the progress we made over the week, especially the progress I’ve made with WoZzing and the confidence I’ve gained in doing this procedure on my own. So over this week, I’m hoping to run the two new participants who have expressed interest. To get another test batch on Mechanical Turk, I also want to look through the slices that had heavy disagreement to see what we can modify about the assignment to get better IRR.

Last exciting update: my mentor and I are planning the details of attending a conference! I would be attending the American Educational Research Association conference in April, which is in New York. This happens to be back-to-back with a conference I’m presenting at for my research with one of my psychology professors, so I have a very busy weekend in the future.


Do you ever do something once and it’s impossible but then the next time you try it you have no idea why it seemed so difficult in the first place? Yeah, that was me this week. I made a lot of progress toward my long term goal of running the WoZ system! Within the past few weeks I had successful recruitment endeavors, and with a handful of potential participants it was time to get familiar with the system so that I can eventually run it myself.

First and foremost, my mentor and I did a very basic trial run that was sort of a step-by-step process of everything to set up the system. Which is a lot. Programs are run on both a remote laptop (mine) and the desktop that the students use, and they are eventually connected so that the dialogue is controlled from the laptop and is sent to the desktop, along with the problem at hand. I had run the programs required for my laptop and had run the programs required for the desktop, but I had never run them together. The whole process for the first attempt took a little over an hour, which is pretty significant for just setup. I had a limited chance to try controlling dialogue, so that was something I needed to put more time into.

Throughout the week, I’ve also been contacting the participants to get times settled. Here’s one thing I suppose I have in common with middle-schoolers: my primary availability falls on the weekends. So at this point, all six of the participants who expressed interest in participating have been scheduled, and I’ll be able to attend for a majority of them! I’ll have the chance to be there for three of them, so I’ll have a chance to observe and run the system for an actual session.

Today I also had a chance to sort of run the system on my own (with my mentor sitting close by to troubleshoot any problems), so I had a solid half hour to run through the protocol for getting participants set up on the system and for actually running the agent in a fake session. Depending on how the student responds, I would choose the type of dialogue to reply with; if they got the step correct, for example, I would choose positive feedback and in the fixed social condition the response would be randomly chosen from a handful of potential positive feedback responses. I’m not entirely comfortable with responding, just because I’m not too familiar with the responding process and I’m worried about finding the correct response in time.
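The “pick a category, then get a random line from that category” idea described above can be sketched like this. It is not the real WoZ software: the category names, the example utterances, and `choose_utterance` are all hypothetical stand-ins.

```python
import random

# Hypothetical pools of utterances, keyed by the category the wizard selects
RESPONSE_POOLS = {
    "positive_feedback": [
        "Nice, that step looks right!",
        "Great job on that one.",
        "Yep, that works.",
    ],
    "prompt_first_step": [
        "What do you think we should do first?",
        "How would you start this problem?",
    ],
}


def choose_utterance(category, rng=random):
    """Pick one utterance at random from the pool for the chosen category."""
    return rng.choice(RESPONSE_POOLS[category])


# The wizard decides the student got the step correct, so positive feedback is chosen
print(choose_utterance("positive_feedback"))
```

Passing in a seeded `random.Random` as `rng` makes the choice repeatable, which would be handy for practice runs.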

This week we have participants scheduled, so I’ll have a productive weekend running a handful of studies. From there, I’m going to try to recruit some more and possibly set up for a third MTurk test.