Igalia interview with Martin Robinson
Brian Kardell: Okay, so, I am here with my coworker Martin Robinson from Igalia and we’re going to talk about accessibility. We do a lot of accessibility-related stuff at Igalia, and you’ve been doing some of that. So, just to start us off - I think there’s kind of an interesting challenge with accessibility because it’s just… like, really foreign to a lot of us in almost every way. How you think something should work is very much shaped by your own modalities and what you’re used to using. I also think that nobody really explains the big picture, the architecture of how all of this works. We just talk about, like, a screen reader, and we don’t talk about how it all fits together. So, I’ve seen a lot of confusion over the years that has to do with not understanding some of this. I’ve even had some of my own. So, can you maybe explain some of the basic, like… architecture of it?
Martin Robinson: Sure, yeah, no problem. I think the most important thing to remember is that at the end of a pipeline of accessibility technology there is always a user using a specific tool, and the entire accessibility stack is made to deliver information from the application to that tool so the user can use the application. One of the most important parts of that stack of technologies is this concept of the accessibility tree, which is an internal, in-memory representation of the application. It allows the accessibility tool to interact with the application and also to get information out of it. If people are familiar with, for instance, a DOM tree - the accessibility tree can be thought of as something similar to that, but for the application interface. Is that sort of what you were going for there?
Brian Kardell: It is, yeah. And… when you say application you mean, like, any application.
Martin Robinson: Right. Yeah, an application like a web browser or a word processor or even the desktop shell you’re using will have a place in the accessibility tree. It will provide its own accessibility tree for the accessibility technology to consume.
Brian Kardell: Yeah, that’s really interesting to me because this is what I mean… I think a lot of people, including myself at various times… I’m not really sure why… I think it’s because on the web we talk about accessibility a lot, but I guess I have always had this very, very strong correlation in my mind between something like a screen reader and a web browser, and never pieced together how those two things would “fit” together, or thought about the fact, rather obvious in retrospect, that you use those screen readers for all of your applications, like your native spreadsheet and your native email client.
Martin Robinson: Yeah that’s right.
Brian Kardell: So how do those applications that aren’t web browsers… how do they create that accessibility tree, like…? Do they program it?
Martin Robinson: It really depends on the application. I think for the majority of applications, you can imagine that they’re written in a toolkit - like GTK on Linux, for example… Or I… want to say Cocoa, but what I mean to say is the modern macOS UI programming, and then Win32 controls on Windows… They all sort of have their own way of exposing accessibility objects to screen readers and other accessibility technologies. Typically this is handled automatically by the toolkit. So, if you write your application in GTK and you’re not using custom widgets, it should just be accessible by default. The problem, of course, is that a web page is a piece of interface, and it’s unclear, looking at any particular web page, how that should be exposed to a screen reader or to a Braille display or something like that.
Brian Kardell: Yeah it’s…it seems pretty clear when you talk about, like… headings and paragraphs and things assuming that something is well-formed and follows good principles, but when it gets to the interactive things - this is very similar to the thing that you just said - like “if you’re not creating custom widgets”
Martin Robinson: Yeah, it’s a similar idea. If your web page is just a document with headings and paragraphs then it’s going to be exposed really nicely to a screen reader. But once you start adding images or more complex interface elements - a lot of web pages have these sets of elements, just imagine Google Docs, for example, it’s an entire application inside of a web page - how do you expose the idea of a menu or buttons to the screen reader, especially when they’re not made with typical HTML form controls? How do you provide that bit of contextual information?
Brian Kardell: I think that’s a point that we try to stress in the web platform itself: to write … like, there are lots of interactive widgets that give you all that magic right there. The magic is a pro and a con. Because: you don’t have to worry about it. But, because you don’t have to worry about it, you’re also not aware of it, so it’s easy to, like, miss that, right?
Martin Robinson: Yeah, for example: the button. The button element has a lot of built-in accessibility support, and if you, for instance, create your own button just from a div or from an image or text, then it may not be exposed to the accessibility technology in the same way as that original HTML form button - and that’s the kind of thing that you would miss by creating your own custom elements.
Brian Kardell: it seems so easy. Like.. I mean.. what is a button? I don’t know it’s like “a thing that you click”… Like, at some level that’s true, but there is this sort of bunch of hard work in… that’s wrapped up in a button and you get it all for free if you just use a button
Martin Robinson: Right, yeah.
Brian Kardell: Let me, like, recap what I think is a simplified understanding: your apps, either directly or through some windowing toolkit where somebody has done a whole lot of hard work, build and maintain this OS-level tree, sort of like a DOM tree but for accessibility, and it’s not in the browser, it’s at the OS level. And then some other app, like… consumes that and interacts with that. It’s almost like client-server communication.
Martin Robinson: Yeah, that’s correct. Generally speaking, the way this works is that applications will expose the tree through inter-process communication to some sort of desktop-wide broker that has all of the trees of all the different applications. Those can then all be presented at once, perhaps through another set of inter-process communication, to some sort of client application: a screen reader, a Braille display, or maybe an on-screen keyboard. That application will be consuming, through IPC, the trees of all the applications on the desktop… a forest.
Brian Kardell: Alright, so… You’re saying through IPC - does it go in both directions? So, like, can your accessibility app move the focus in this application or something?
Martin Robinson: For sure. That is another huge, huge responsibility of the accessibility tree: to accept requests from accessibility technologies for, like you said, moving the focus. Another big one is activating. So if you have a link or a button, and the user wants to activate that button, then the accessibility technology sends a message to the application, which in this world is considered a server. The application is serving up the accessibility tree and taking requests from clients, which are the accessibility technologies.
Brian Kardell: I feel like that kind of helps illuminate some of the stuff that’s hidden from you… Because you look at it and you say “well, what is a button? It’s just the thing that you can click, like, what’s so special about it?” - but it needs to create this accessibility tree and also, like, receive events… And so the contract of these things is considerably more involved than you think, and if you don’t meet some end of the contract, something won’t work the same.
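The request flow Martin describes can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class ToyApplication, the request dictionaries, and the action names "focus" and "activate" are all hypothetical, and the real exchange happens over platform IPC rather than direct method calls.

```python
class ToyApplication:
    """Plays the 'server' role: owns widgets and answers requests about them."""

    def __init__(self):
        # Hypothetical widget table: widget id -> role.
        self.widgets = {"ok": "button", "help": "link"}
        self.focused = None      # id of the widget that currently has focus
        self.activated = []      # ids of widgets activated so far, in order

    def handle(self, request):
        """Handle one request dict sent by an assistive-technology client."""
        target = request.get("target")
        if target not in self.widgets:
            return {"ok": False, "error": "unknown target"}
        action = request.get("action")
        if action == "focus":
            self.focused = target
        elif action == "activate":
            self.activated.append(target)
        else:
            return {"ok": False, "error": "unsupported action"}
        return {"ok": True}


app = ToyApplication()
# In a real stack these requests would travel over IPC; here they are plain calls.
print(app.handle({"action": "focus", "target": "ok"}))
print(app.handle({"action": "activate", "target": "ok"}))
print(app.handle({"action": "activate", "target": "missing"}))
```

The point of the sketch is the direction of the calls: the assistive technology is the client that asks, and the application is the server that owns the state and decides what happens.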
Martin Robinson: No for sure.
Brian Kardell: All right - So… you had mentioned screen readers I think they’re the one that like everybody talks about. Our colleague Joanie and Igalia are the maintainers of one of those: the Orca screen reader for Linux.
Martin Robinson: Mm-hmm.
Brian Kardell: I think that’s awesome - and we do a lot of that as an investment – like… we altogether choose to take some of the money that we could put in our pockets and we all contribute it back
Martin Robinson: mm-hmm that’s right!
Brian Kardell: I think that’s really awesome - I mean… right?
Martin Robinson: Yeah, no for sure. That’s kind of part of core… Our core philosophy, I think
Brian Kardell: Yeah… you know this is the first time I have had the opportunity to work directly with people who like, work on the screen readers and like down in the ditches and like… do all the actual work on making that work - and it never dawned on me… it was kind of surprising for me to learn that… like…screen readers don’t actually… speak.
Martin Robinson: [chuckles] No. You’d think from the name that they would read the screen, but no.
Brian Kardell: That’s right… right, yeah that’s… do you want to like say something about it… like, clear that up somehow?
Martin Robinson: Sure. Yeah, I think it’s important to note that screen readers use speech subsystems to actually produce the audio of the spoken content. So on Linux, for example, Orca uses a piece of software called Speech Dispatcher by default, although there are other speech backends… And then there are similar speech backends on other platforms. Essentially, Orca can just send the text that it wants to speak to these APIs and they’ll actually produce the audio output. So text-to-speech is a whole other set of software and libraries that the screen reader sits on top of.
Brian Kardell: That… yeah, that’s another thing that’s really fascinating to me. I know: you think it’s your browser speaking but it’s not the browser speaking, right? It’s the OS-level thing that knows how to speak. Yeah, okay, so… there’s this mystical accessibility tree at the operating system level and we said it’s kind of like the DOM, but, like, what’s in it? That maybe sounds like a strange question, but as I understand it, a lot of the concepts in Orca were developed based on these accessibility trees. And that’s another thing - like… are they the same tree in every operating system, or do they actually differ a little bit?
Martin Robinson: No, they differ. They’re very similar, but different platforms have different expectations of what should be exposed in these desktop-wide trees, so there is some degree of difference. I would say the vast majority of the contents are quite similar. As an example of what sort of thing is in the tree: just imagine a tree of nodes, where each node can have children and then certain properties. Probably the most important property of the node is the role, and the role basically gives some indication of what the purpose of that node is. For example, there might be a role label for labels, a role button for buttons. In ATK there’s a role panel for generic panel elements. That tells the screen reader what the purpose of that node is. Another thing that each node might contain are states: Is this item focused? Is it selected? And then also attributes. For example, you can imagine DOM attributes, which include things like ID or style classes. The accessibility nodes also have attributes.
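The split between screen reader and speech backend can be illustrated with a small Python sketch. Everything here is hypothetical: RecordingSpeechBackend stands in for a real text-to-speech service and simply records the text it is handed, and ToyScreenReader is not how Orca is implemented.

```python
class RecordingSpeechBackend:
    """Stand-in for a text-to-speech service; records text instead of producing audio."""

    def __init__(self):
        self.spoken = []

    def speak(self, text):
        self.spoken.append(text)


class ToyScreenReader:
    """Decides *what* to say, then hands the text to whatever backend it was given."""

    def __init__(self, backend):
        self.backend = backend

    def announce(self, role, name):
        # The screen reader only produces text; turning it into sound is the backend's job.
        self.backend.speak(f"{name}, {role}")


backend = RecordingSpeechBackend()
reader = ToyScreenReader(backend)
reader.announce("button", "Apply")
print(backend.spoken)
```

Because the reader only depends on a speak method, a different backend could be swapped in without changing the reader, which is the layering described above.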
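To make that node structure concrete, here is a minimal Python sketch of a tree whose nodes carry a role, states, attributes, and children. The class AccessibleNode and its field names are hypothetical and do not correspond to ATK or any other real platform API.

```python
from dataclasses import dataclass, field


@dataclass
class AccessibleNode:
    """One node in a toy accessibility tree (hypothetical, not a platform API)."""
    role: str                                        # e.g. "button", "label", "panel"
    name: str = ""                                   # text a tool could present
    states: set = field(default_factory=set)         # e.g. {"focused", "selected"}
    attributes: dict = field(default_factory=dict)   # free-form key/value pairs
    children: list = field(default_factory=list)

    def walk(self, depth=0):
        """Yield (depth, node) pairs in tree order, parents before children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def describe(root):
    """Render the tree as indented lines of 'role: name [states]'."""
    lines = []
    for depth, node in root.walk():
        states = ",".join(sorted(node.states))
        suffix = f" [{states}]" if states else ""
        lines.append(f"{'  ' * depth}{node.role}: {node.name}{suffix}")
    return lines


tree = AccessibleNode("panel", "Settings", children=[
    AccessibleNode("label", "Volume"),
    AccessibleNode("button", "Apply", states={"focused"}),
])
print("\n".join(describe(tree)))
```

A screen reader consuming a tree like this would use the role to decide how to present each node and the states to decide what to say about it right now.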
Brian Kardell: yeah that sounds a lot like ARIA. My understanding is that ARIA itself was sort of shaped based on these common concepts in different operating systems and sort of imagining how to provide a uniform way to express all this for the web platform with standard mappings to each one… Have I… Have I got that right?
Martin Robinson: That’s correct. So, at least in my head, ARIA can be thought of as a platform-independent version of a platform accessibility tree, which can later be mapped onto the real platform accessibility tree. There’s actually a series of specs that kind of track this layering. The first one is WAI-ARIA, which defines ARIA. It doesn’t say anything about how it maps onto the platform APIs, but it defines the basic concepts of what that kind of API looks like, with roles and attributes and states and those sorts of things. And then other specs map that onto platform APIs. So, if you look at Core-AAM, which is the Core Accessibility API Mappings, that spec actually tells you: well, on this platform, this sort of ARIA node maps onto this sort of platform node. And then if you look further you see that there are other specs such as HTML-AAM, which is the HTML Accessibility API Mappings. That sits sort of on top of both of those specifications, and what it does is say: okay, here is an HTML element, and implicitly it maps to this ARIA role, and right beside it, it says it also maps to this platform accessibility tree role. So it tells the whole story by referring to both of those specs, and it really helps stitch them together in these layers.
Brian Kardell: Yeah, there’s a lot of accessibility specs. There’s also the ARIA Authoring Practices, which is great if you are a developer because it also contains, like, patterns for controls that have no current parallel in HTML, like tab sets and accordions.
Martin Robinson: Yeah, that’s right. Where HTML-AAM provides implicit mappings for already existing HTML elements, if you want to create your own more complicated pieces on your page, then the ARIA authoring guide is a great way to see how some of those should be represented in ARIA.
Brian Kardell: Ok, so the browser is just another one of these apps - and it has to build and map its accessibility tree and that’s complicated: You have ARIA and the mappings to the operating system-level thing, but then to create that you… you can’t even just look at the DOM, right? Like, it’s more complicated than that, right?
Martin Robinson: That’s right. Take the example of CSS generated content, for instance ::before. Using ::before produces content that goes before the content of the elements matched by a particular CSS selector. That content isn’t in the DOM, but it is useful for it to be exposed in the accessibility tree. So, essentially, the accessibility tree needs to be built based on the combination of the DOM with the CSS.
Brian Kardell: display:none - right? Like, if it’s display:none we can’t have it be in the tree or else that would be bad - and a number of things are effectively implemented that way, right? Like: they’re just hidden from display. You wouldn’t want your document to read the head of your HTML - like, you don’t want it reading the text of script elements!
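The layering of those specs can be pictured as two lookup tables chained together. The Python sketch below is illustrative only: the table names and the function platform_role are hypothetical, and the handful of entries are written from memory as assumptions, so the HTML-AAM and Core-AAM specifications remain the authority for the real mappings.

```python
# Layer 1 (the kind of thing HTML-AAM records): HTML element -> implicit ARIA role.
# Assumed, simplified entries; "a" here means an anchor that has an href.
HTML_TO_ARIA = {
    "button": "button",
    "h1": "heading",
    "a": "link",
}

# Layer 2 (the kind of thing Core-AAM records): ARIA role -> platform role.
# Assumed ATK-style role names, for illustration only.
ARIA_TO_ATK = {
    "button": "ROLE_PUSH_BUTTON",
    "heading": "ROLE_HEADING",
    "link": "ROLE_LINK",
}


def platform_role(tag):
    """Resolve an HTML tag to a platform role by going through its ARIA role."""
    aria_role = HTML_TO_ARIA.get(tag)
    if aria_role is None:
        return None          # no implicit mapping in this toy table
    return ARIA_TO_ATK.get(aria_role)


for tag in ("button", "h1", "blink"):
    print(tag, "->", platform_role(tag))
```

Each platform would have its own second table, which is why ARIA can stay platform-independent while the mapping specs absorb the differences.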
Martin Robinson: no, yeah - right.
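The idea that the accessibility tree comes from the DOM combined with CSS can be sketched in Python. This is a toy model: elements are plain dictionaries, the css key is assumed to hold already-computed style, the before key is a hypothetical stand-in for ::before generated content, and skipping head and script by tag name is a simplification of what a browser actually does.

```python
# Tags whose contents should never be presented to the user (simplified).
SKIPPED_TAGS = {"head", "script", "style"}


def accessible_text(element):
    """Collect, in order, the text a toy accessibility tree would expose.

    `element` is a hypothetical dict with `tag`, optional `text`, optional
    `children`, and an optional `css` dict of computed style.
    """
    css = element.get("css", {})
    if element["tag"] in SKIPPED_TAGS or css.get("display") == "none":
        return []                      # hidden subtrees contribute nothing
    parts = []
    if css.get("before"):
        parts.append(css["before"])    # generated content: in the CSS, not in the DOM
    if element.get("text"):
        parts.append(element["text"])
    for child in element.get("children", []):
        parts.extend(accessible_text(child))
    return parts


page = {"tag": "html", "children": [
    {"tag": "head", "children": [{"tag": "script", "text": "init()"}]},
    {"tag": "body", "children": [
        {"tag": "p", "text": "Saved", "css": {"before": "Note:"}},
        {"tag": "p", "text": "Draft", "css": {"display": "none"}},
    ]},
]}
print(accessible_text(page))
```

In this sketch the script text and the display:none paragraph never reach the output, while the generated "Note:" does, even though it appears nowhere in the element's own text.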
Brian Kardell: alright… so… we work on all kinds of stuff in this space and in more than one browser, right?
Martin Robinson: Well, looking at the specifications - we’re involved in many of the specifications - in the editing process…
Brian Kardell: Mm-hmm
Martin Robinson: we’re also involved in the implementation - focusing mainly on the Linux desktop, but with other desktops as well. A lot of our work, though, is focused on the Linux desktop - implementing the accessibility support for ATK and AT-SPI2. We’re working in the three major browsers: Chromium, Gecko/Firefox and WebKit.
Brian Kardell: When it comes to accessibility on Linux it would be safe to say that Igalia is a strong strong ally.
Martin Robinson: yeah I think that’s safe to say.
Brian Kardell: Kind of looping back - like I said in the very beginning, it’s very hard for a lot of us to, like… even relate to how things should work. I’m not the first person to say that if you just give a screen reader to a developer who has never experienced one before and tell them “your site isn’t accessible - here’s a screen reader, go fix it,” a lot of their intuitions turn out to be wrong and they wind up making it worse instead of better. Even if you really know what you’re doing, ARIA can be pretty complicated.
Martin Robinson: right.
Brian Kardell: that’s why I think the first rule of ARIA is: Don’t use ARIA… Can you maybe talk about that a little bit or something?
Martin Robinson: Yeah, so I guess I can point to an example of how this is tricky to get right… We’ve run into issues where people have tried to create something as simple as a button, and they perhaps create an element - an HTML element - and give it the button role, but then they don’t make it respond to keyboard events… So, say that you create a button where, if the user mouses over that button and clicks it, it works just fine… but it’s not focusable, so it’s impossible to actually activate it with the keyboard. Screen readers - at least Orca - are smart enough to notice when that’s happening and will try to click on it, but only if everything is set up properly in the client code… So once you start going down these paths it’s really easy to get into a situation where the screen reader is just not going to know how to operate on your custom element, and that’s the kind of trouble you can get into when you start using Orca… sorry… using ARIA without setting up the elements in a proper way. The thing is, nowadays screen readers are very similar to web browsers in that they’re built to deal with bad authoring, and they have a lot of workarounds to make that happen… To some extent, that’s sort of the life that you take on when you work on a screen reader. You have to deal with the web that exists, not the one that you really want. So… I think from our perspective as implementers, it’s just that if there’s something that you have to work around, you have to work around it, because at the end of the day the software has to work.
Brian Kardell: If the first rule of ARIA is don’t use ARIA, and it can be complicated… like… who should use ARIA? The way I’ve looked at this in the past is that ARIA is a set of low-level things that are not (generally) for the average author. The goal of the platform is to make it so you don’t have to use that, and ARIA is then like an escape hatch for when the platform isn’t meeting your needs.
Martin Robinson: I think that’s a good way of describing it.
Brian Kardell: So, like, in some ways - and in theory anyway - it’s really good, because different people can sort of centralize their work and make, like, a toolkit of components that have all these accessibility characteristics, and that’s great. But we also don’t have native equivalents of some of those, and the ones that we have, as we say, people have to turn away from too quickly… So part of me is wondering if you have any thoughts on, like… how did it get that way? Like… how did we wind up here?
Martin Robinson: There are these sort of two worlds - the web platform implementers and then web developers - and it’s really surprisingly rare that there’s a big overlap between those two… So a lot of times it’s really difficult to know exactly what web developers need or what they’re missing, which is why I think this communication between these two groups is really important.
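The pitfall Martin describes, an element given the button role that cannot receive keyboard focus, is something a simple static check can catch. The Python sketch below uses the standard library HTML parser; the class ButtonRoleChecker, the function check, and the simplified list of natively focusable tags are assumptions for illustration. It cannot see keyboard handlers attached from script, so a clean result does not prove the element is operable.

```python
from html.parser import HTMLParser

# Tags treated here as keyboard-focusable without tabindex (a simplification;
# for example, a real anchor is only focusable when it has an href).
NATIVELY_FOCUSABLE = {"button", "a", "input", "select", "textarea"}


class ButtonRoleChecker(HTMLParser):
    """Flags elements that claim role=button but have no tabindex."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("role") != "button" or tag in NATIVELY_FOCUSABLE:
            return
        if "tabindex" not in attributes:
            self.problems.append(f"<{tag}> has role=button but no tabindex")


def check(html):
    """Return a list of problem descriptions for the given HTML string."""
    checker = ButtonRoleChecker()
    checker.feed(html)
    return checker.problems


sample = (
    '<div role="button" onclick="save()">Save</div>'
    '<div role="button" tabindex="0">OK</div>'
    '<button>Go</button>'
)
print(check(sample))
```

The native button at the end passes without any extra attributes, which is the practical meaning of the advice to prefer built-in elements over ARIA where possible.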
Brian Kardell: that’s a really hard thing to resolve and I bring it up because I wanted to mention some new efforts at this and get your thoughts on them… so I don’t know if you’re aware but Nicole Sullivan at Google and Greg Whitworth at Microsoft are sort of spearheading some efforts to study why people don’t use those elements… we know plenty of reasons already, but… and how to fix that in order to make form controls more styleable, and things like that so that people don’t find an excuse to go pop the escape hatches right away… so I don’t know have you seen that? what do you think about it?
Martin Robinson: I haven’t seen that, but, I mean, that sounds great. I feel like that would go a long way toward relieving some of these situations that pop up.
Brian Kardell: yeah. There’s… there’s lots of things in the platform where you currently have to turn to escape hatches and also those escape hatches are like… a pretty comparatively low level. There’s even sort of intermediate things that we can do to greatly improve like, the complexity of a whole bunch of use cases and make it easier to get things right that aren’t even just components… Things that make components possible and easier to develop.
Martin Robinson: Yeah, at the end of the day, if it requires a huge development team of experts to get accessibility right in order to build a web application, then we as providers of the web platform need to do a better job of making it really easy to get accessibility right, even when you’re building these custom widgets, if they’re actually required for building modern web applications. Yeah, so… I think you’re totally right that the responsibility for making this work really lies with us and not with millions of web developers who don’t have years of experience doing accessibility.
Brian Kardell: I really, really like the sort of nuance in what you’re saying. A lot of times, like, historically these communities are sort of at odds with one another - like the design community and the accessibility community. Our friend Alice Boxhall, who works at Google - she is on the W3C TAG now - and in her candidate statement - it stuck with me… she said, like, “developers… you know… they want their things to be accessible. It’s not that they don’t want them to be accessible, right? It’s just currently very hard, and if we make it hard to do the right thing, very few people will do the right thing… so we have to find ways to lower the barrier.”
Martin Robinson: yeah… yeah, totally.
Brian Kardell: … But have you had any thoughts around… like… or did you have any thoughts on, like, what elements should HTML have that it just doesn’t have and that makes it really difficult for a lot of authors?
Martin Robinson: I mean, I think from the perspective of accessibility the most important thing is that when we add things there has to be, from the beginning, a story around how they interact with accessibility technologies. The days of creating controls without actually taking into consideration screen readers and the people using them - that just has to be in the past… because the web is too big and too important to leave people behind. So… I think that large corporations proposing things really need to make their proposals with those considerations.
Brian Kardell: Yeah, I think getting people to agree on any of these things is like… really, really hard and historically this has been like… really based on lots of talk and debate and at the end of a really long process we give people something, and the question is: did we get it right? okay… so, um thanks for taking the time and talking with me Martin. I appreciate it.. I learned a lot in this actually and I found it sort of informative and useful, so thanks, thanks.
Martin Robinson: thanks yeah my pleasure