Abstract
The modular control hypothesis suggests that motor commands are built from pre-coded modules whose specific combined recruitment can allow the performance of virtually any motor task. Despite considerable experimental support, this hypothesis remains tentative as classical findings of reduced dimensionality in muscle activity may also result from other constraints (biomechanical couplings, data averaging or low dimensionality of motor tasks). Here we assessed the effectiveness of modularity in describing muscle activity by tackling typical limitations in the experimental and computational design, and testing two essential predictions: (i) muscle patterns must be decomposable onto invariant modules and (ii) module recruitment must determine the task at hand. We designed a comprehensive experiment comprising 72 distinct point-to-point whole-body movements for which the activity of 30 muscles was recorded. To identify invariant modules, we used a space-by-time decomposition of single-trial muscle activity that has been shown to encompass classic modularity models. To critically test the decompositions, we examined not only the amount of variance they explained but also whether the direction of the movement performed on each trial could be inferred by the single-trial module activations. Overall, the modular decomposition was more effective than non-modular alternatives in terms of data approximation, direction discrimination and dimensionality reduction. These findings show that few spatial and temporal modules give a compact approximate representation of muscle patterns carrying nearly all task-relevant information of a variety of whole-body reaching movements.
1 Introduction
Human motor control has been hypothesized to rely on a modular organization of muscle activity (the so-called muscle synergies or motor primitives) since Bernstein (1967) and the seminal works of Bizzi et al (1991). This hypothesis postulates that the central nervous system (CNS) exploits modularity as a simplifying mechanism to generate and control the many neuromusculoskeletal degrees of freedom underlying voluntary movements (Flash and Hochner, 2005; Bizzi et al, 2008). Over the past years, there has been substantial evidence supporting the hypothesis (Berniker et al, 2009; Overduin et al, 2012; Nazarpour et al, 2012; Berger et al, 2013) as well as contradicting (Kutch et al, 2008; Valero-Cuevas et al, 2009) or questioning it (de Rugy et al, 2013; Zelik et al, 2014; Inouye and Valero-Cuevas, 2016). As this hypothesis is in fact difficult to falsify (e.g. Tresch and Jarc, 2009), it is important to develop approaches that critically test its fundamental predictions (Kutch and Valero-Cuevas, 2012; Delis et al, 2013b; Berger et al, 2013). Here our rationale is that an effective modular decomposition of muscle activations must allow not only the formation of genuine muscle patterns but also the distinct characterization of a variety of movements. On the one hand, modularity models are typically assessed based on their ability to approximate the recorded electromyograhic (EMG) data. However, an absolute expectation on the EMG data reconstruction (quantified as variance accounted for, VAF) is arbitrary as high VAF values may also result in overfitting, i.e. the resulting decomposition may contain modules that explain task-irrelevant variance in the EMG recordings, which can be considered “noise”. In other words, VAF measures focus on approximating EMG activations and ignore the task-relevance of the resulting representations. On the other hand, a plausible modular decomposition must allow mapping a desired movement onto an adequate activation of modules. The single-trial recruitment of these modules must, in turn, lead to unequivocal task realization (i.e. hierarchical control hypothesis; Todorov et al, 2005). Hence, a necessary condition is that distinct movements should be discriminable from the way modules are recruited on single trials. Without this property, consistent recruitment of modules could lead to execution of unintended motor tasks, thereby invalidating the hypothesized control scheme. Hence, movement discriminability constitutes a strong falsifiability test of the effectiveness of modular decompositions, which is complementary to the standard VAF evaluation. Therefore, we propose assessing modular representations of muscle activity not only in input space as usually done (EMG data reconstruction) but also in task space (motion direction discrimination here) (Delis et al, 2013b; Alessandro et al, 2013b; de Rugy et al, 2013).
In this study, we test how effective a modular description of muscle activations is by comparing it with non-modular structures in terms of these two prerequisites. To limit classical shortcomings, we combine a) a highly comprehensive experiment with b) a unifying modularity model of EMG activity (Delis et al, 2014). Our experimental design comprises surface EMG recordings from a large number of muscles (30) spread across the human body on both hemibodies. Importantly, muscle activity is recorded during performance of a large number of whole-body pointing movements (72 distinct motions) in the 3-dimensional space (Stapley et al, 2000; Leonard et al, 2009). This protocol imposes no further constraints and spans a wide range of movements requiring whole-body coordination including upper and lower limbs while preserving equilibrium. Furthermore, multiple repetitions (30) of the movements are recorded, which allows capitalizing on single-trial variability and assessing its consistency with the modular control hypothesis that must enable adequate task execution not only “on average” but on individual trials. Based on these specifications, we collect a remarkably large EMG dataset, on which we can evaluate whether modular structures are effective in i) yielding a parsimonious yet accurate description of EMG patterns and ii) ensuring reliable characterization of distinct movements. To investigate the former, we apply a generic model of modularity, named spaceby-time decomposition, which assumes the concurrent existence of spatial and temporal modules (Delis et al, 2014). The use of such a unifying model is crucial for limiting the dependence of conclusions upon the decomposition model used, as temporal, spatial or spatiotemporal modular decompositions have been assumed separately before (Bizzi et al, 2008; Ivanenko et al, 2005; d’Avella et al, 2006). To tackle the latter, we employ a single-trial task decoding analysis (Delis et al, 2013b,a) to assess how well module recruitment maps onto movement identity, which would be impossible or meaningless if trial-averaged data or a limited set of tasks were considered. Ultimately, we apply these analyses to compare the modular decomposition with non-modular alternatives in terms of their ability to compactly and unequivocally describe muscle patterns and motion directions. This assessment allows us to critically evaluate the effectiveness of the space-by-time modularity model by attempting to falsify it (Ajemian and Hogan, 2010). In other words, if its motion direction discrimination or data approximation power are significantly smaller compared to equivalently parsimonious non-modular descriptions of individual muscle patterns, this would cast serious doubts about the pertinence of the space-by-time modularity, otherwise it would prove it is a useful descriptive model compatible with the modular control hypothesis.
2 Materials and Methods
2.1 Experimental procedures
Subjects
Four healthy participants (2 males and 2 females, aged = 25 ± 3 years old, height = 1.72 ± 0.08 m, weight = 70 ± 7 kg, all values presented hereafter refer to mean ± s.e.m.) voluntarily agreed to participate in this study and performed the experiment. None of them had any previous history of neuromuscular disease. All subjects were made aware of the protocol, and written consents were obtained before the study. Experimental protocol and procedures were approved by the Dijon Regional Ethics Committee and conducted according to the Declaration of Helsinki. As the study focused on intra-individual analyses, few subjects were included in the study.
Motor task
Participants were asked to execute whole-body point-to-point movements in various directions at a self-selected pace. The experimental protocol (illustrated in Figure 1) specified 9 targets on 3 vertical bars. Subjects stood barefooted and performed pointing movements between all pairs of targets, termed motion directions throughout the paper, using the index fingertip of their dominant arm (right) while standing (i.e. a total of 72 different pointing movement directions). No constraint was imposed on the left arm. Each participant repeated each motion direction 30 times for a total of 2160 recorded movements per participant. Given the large amount of movements, we separated the whole experiment in two sessions (approximately 2 hours for each session) to avoid participants’ fatigue, with 24h between the two visits. Movements were pseudo-randomized within each session in order to ensure same number of trials for each of the 72 motion directions (15 on day 1 and 15 on day 2). We marked electrode placement on subjects’ skin to limit measurement noise due to recording position changes. As reported previously, EMG recordings from different days yield highly similar modular structures (Santuz et al, 2016). Here, we also verified that the removal of electrodes between the two sessions did not critically affect the EMG recordings as well as the identified modules. We found a highly significant mean correlation coefficients of 0.89 ± 0.09 for spatial modules and 0.99 ± 0.01 for temporal modules (p<0.001) between the two recording sessions of each subject, which shows that the extraction method was robust and that a single extraction including both sessions could be performed. We therefore present the extraction with the 30 repetitions in the Results section.
Kinematics and EMG recording and preprocessing
We recorded 30 muscles by means of an Aurion (Milan, Italy) wireless surface electromyographic system. The skin was shaved before electrode placement, and abraded softly. EMG electrodes were placed symmetrically on the two sides of the body on the following muscles: tibialis anterior (Ta), soleus (So), peroneus (Pr), gastrocnemius (Ga), vastus lateralis (Vl), rectus femoris (Rf), biceps femoris (Bf), gluteus maximus (Gm), erector spinae (Es), pectorialis superior (Ps), trapezius (Tz), anterior deltoid (Da), posterior deltoid (Dp), biceps brachii (Bb), triceps brachii (Tb). These muscles were chosen because they are involved in whole-body reaching, and importantly, they not only cover a large part of the human body but they are also easily recordable via a surface-EMG systems. Correct electrode placement was verified by observing the activation of each muscle during specific movements known to involve it (Kendall et al, 2005). During this procedure, EMG signals were monitored in order to optimize recording quality and minimize cross-talk from adjacent muscles during isometric contractions. The 3D positions of 20 retroreflective markers (diameter = 20 mm) were simultaneously recorded using an optoelectronic measuring device (Vi-con Motion System, Oxford, UK) at a sampling frequency of 100 Hz. 16 passive markers were fixed symmetrically on the two hemibodies (acromial process, humeral lateral condyle, ulnar styloid process, apex of the index finger, greater trochanter, knee interstitial joint space, external malleolus, and fifth metatarsal head of foot). We added external cantus of the eye on the right face, auditory meatus on the left, and head apex and the first thoracic vertebra (T1) at the middle. The kinematic data were low-pass filtered (Butterworth filter, cut-off frequency of 20 Hz) and numerically differentiated to compute tangential velocity and acceleration of each marker. To restrict our analysis to movement-related activity, we defined movement onset (to) and end (tend) times as the beginning and end of a time interval in which the fingertip velocity was continuously above 5% of its maximum, and which contained this maximum (Delis et al, 2013b). This time interval captured the principal EMG signal variations related to the considered conditions. Figure 1 (right-down panel) shows movement kinematics (initial and final posture as well as fingertip trajectories) and (both raw and filtered) EMG signals for one pointing condition T1-T9 (diagonal movement from top right to bottom left). Main results were qualitatively the same when defining trials from to-100ms to tend-100ms to account for the electromechanical delay between EMG activity and real force production. The EMGs for each trial were rectified, low-pass filtered to obtain smooth envelopes of EMG activity (Butterworth filter, cut-off frequency of 3Hz, zero-phase distortion;Ivanenko et al, 2004) and normalized to 1,000 time steps. A final waveform of 50 time steps was then obtained by using trapezoidal integration of the latter signal on a uniform temporal grid. Movement artifacts were visually removed by discarding the associated trials (<2% of the total number of trials). The data were then normalized in amplitude on a muscle-per-muscle basis by dividing each single-trial muscle signal by its maximal value attained throughout the experiment. A potential detachment of EMG electrodes was assessed, for each subject, by visually checking a posteriori that none of the recorded muscles showed an abnormal change in signal amplitude across trials. For each subject, we finally formed an EMG matrix of (50 time steps × 30 muscles) in rows and 2160 trials in columns consisting of all the movement-related EMG activity (rectified and filtered) of the 30 muscles for all recorded trials. This matrix was used as input to the modular decomposition algorithm to characterize the spatial and temporal structure of muscle activations for this set of movements.
2.2 Space-by-time modular decomposition of muscle activity Space-by-time decomposition model
To represent muscle activity as a structured modular decomposition, we used a space-by-time decomposition model (Delis et al, 2014). This modularity model decomposes muscle activity in separate but concurrent spatial and temporal modules and combines them in single trials using scalar coefficients in order to approximate the recorded EMG activity. More formally, according to the space-by-time decomposition, a single-trial muscle pattern can be written as follows (T and M being the number of time frames and muscles, respectively): where and are the temporal and spatial modules respectively, and is a single-trial scalar activation coefficient. The free parameters P and N correspond to the number of temporal and spatial modules, respectively, and are set by the user.
Variance accounted for (VAF)
To assess how well the space-by-time decomposition approximates the recorded EMG activity, we computed the variance accounted for (VAF) by the spaceby-time decomposition. VAF is defined as the total reconstruction error normalized by the total variance of the dataset as follows: where is the mean muscle activity across all trials, time steps and muscles and ║. ║ denotes the Frobenius norm. Note that different VAFs could be defined by replacing by zero or any other reference value (Torres-Oviedo et al, 2006), indicating that VAF values may differ depending on the precise definition of .
Module extraction
To extract the space-by-time representation of muscle activity in the set of movements under consideration, we applied sNM3F, a Non-negative Matrix Factorization (NMF) based module extraction algorithm that implements effectively the space-by-time decomposition and identify meaningful spatial and temporal modules (Delis et al, 2014). The advantage of NMF-based decompositions is that they restrict the extracted modules and activations to be non-negative, which makes them physiologically relevant for EMG signals reflecting well the “pull only” behavior of muscles (i.e. muscles cannot be activated “negatively”). We input the preprocessed EMG matrix (see above) of each subject to sNM3F and extracted P temporal modules, N spatial modules and P × N × S activation coefficients capturing all the variations across trials and movement directions. The numbers of spatial and temporal modules (P and N respectively) are free parameters of the algorithm, thus we varied P = 1, …, 10 and N = 1, …, 10 and computed the decomposition for all the 100 possible (P, N) pairs. The smallest set of modules describing best the data was then estimated using a decoding analysis (see Module selection and clustering).
2.3 Direction decoding analysis
We also tested whether the identified decompositions allowed discriminating the 72 distinct motion directions specified by the experiment. With such a large number of directions to discriminate, achieving high decoding performance may be challenging (chance level is very low, equal to 1.4%). We used the single-trial activation coefficients to decode the direction of the performed movement in each trial by means of a linear discriminant analysis (LDA) in conjunction with a leave-one-out cross-validation (Delis et al, 2013b). We quantified decoding performance as the percentage of correctly decoded trials and reported results in the form of confusion matrices. The values on a given row i and column j of the confusion matrix C(i, j) represent the fraction of trials on which the executed motion direction j was decoded as the direction i. If decoding is perfect, the confusion matrix has entries equal to one along the diagonal and zero everywhere else.
2.4 Module selection and clustering
Selecting the number of modules
We used the decoding analysis described above to identify the most compact and direction-discriminative space-by-time decomposition. Decoding performance was computed step by step by increasing gradually the numbers of temporal and spatial modules extracted (P, N respectively). The number of modules was then selected as the step at which adding a supplementary module did not give any significant decoding gain (p > 0.05). To assess the significance of decoding performance, we employed a permutation test where we randomly shuffled the coefficients corresponding to the added module (while the distributions of all other coefficients were unaffected) and computed discrimination performance. Hence, this procedure ensured the detection of modules that explain the “task-relevant” variability and the exclusion of other sources of noise (“task-irrelevant” variability) (Delis et al, 2013b,a).
Clustering analysis
To compare modules of the same type (spatial or temporal) extracted from different subjects, we grouped them using an agglomerative hierarchical cluster analysis (Hastie et al, 2009). Although it was not crucial for the present study, such a clustering can be useful for visualization purpose and for comparison of our results with other studies. In particular, it is worth mentioning that the modular control hypothesis does not impose that different subjects must have the same modules but that states that each subject may rely on its a modular structure to generate genuine muscle patterns. In the following, we will present the procedure used for clustering in detail for spatial modules, but the same procedure was followed also for clustering the temporal modules. We quantified the similarity between spatial modules i and j as their correlation coefficient (Ri,j). We considered spatial modules as M-dimensional vectors and computed correlation coefficients between all pairs of modules across all pairs of subjects. Using Ri,j as distance measure, we created a hierarchical cluster tree from all module pairs (Matlab function “linkage” with the “average” distance method, i.e., using as distance between two clusters the average distance between all pairs of objects across the two clusters). The number of clusters was set to the maximum number of spatial modules across subjects (i.e. 7 here). We thus ensured that the resulting clusters included one or zero modules from each subject. The correlation between modules was then computed as the mean pairwise correlation between all pairs of modules within each cluster.
2.5 Significance of identified decompositions
We used a permutation test to assess the ability of the identified decompositions to uncover meaningful structure in the data. We compared the VAF and decoding performance of the identified decompositions with the VAF and decoding performance values obtained when decomposing structureless data. We generated structureless data from the recorded data by randomly permuting the muscles for each time step of each trial in every movement. The input matrix thus had exactly the same numerical values but was devoid of biomechanical significance. For each subject, we performed 10 different permutations, which resulted in 10 simulated datasets on which we applied sNM3F to extract space-by-time decompositions and computed VAF and decoding performance of the resulting decompositions. We considered as significance level the maximum of the VAF and decoding performance obtained for these decompositions. Quality of the VAF and decoding performance obtained for the recorded data was then evaluated relative to this significance level.
2.6 Comparison with non-modular muscle pattern descriptions
To compare the efficiency of the extracted modular decompositions with non-modular alternatives, we computed decoding performance and VAF of non-modular descriptions of the data with equal number of parameters as the modular decompositions. In particular, we examined whether alternative descriptions of muscle activity that do not rely on an explicit modularity model are more or less effective than the space-by-time decomposition in a) discriminating the performed motion direction and b) approximating the EMG signals. This analysis also served to investigate whether a subset of the recorded muscles or a shorter temporal window of muscle activity suffices for the characterization of a) the recorded EMG signals and b) the direction differences in single-trials. Thus, we compared direction decoding and VAF results of the space-by-time decomposition with those obtained by artificially reducing the spatial dimensions (i.e. choosing a subset of muscles) or the temporal dimensions (i.e. splitting muscle activity into shorter temporal windows) or a combination of the two. To this aim, we divided the muscle activity of each of the M muscles (spatial dimension) into B bins (temporal dimension) and computed the root-mean-square (rms) of the EMG signal of each muscle within each temporal bin. This procedure yielded M × B parameters for each trial, which we refer to as “non-modular” parameters. To perform a fair comparison, we then matched the number of non-modular parameters with the dimensions of the space-by-time decomposition, i.e. selected an identical number of parameters for both descriptions, in three different ways.
Firstly, we examined whether adding more spatial dimensions (and ignoring the temporal structure of the data) would enhance direction discrimination performance. Thus, the first set of non-modular parameters described only the spatial dimension of muscle activity by varying the number of muscles retained (M = N × P) and keeping only one temporal dimension (B = 1). Secondly, we examined the effect of adding more temporal dimensions (and ignoring the spatial structure of the data). Thus, to obtain the second set of non-modular parameters, we varied the number of temporal bins (B = N × P) and kept only one spatial dimension (M = 1). Thirdly, we selected equal numbers of spatial and temporal dimensions with the space-by-time decomposition (M = N, B = P) and asked whether we could achieve higher direction discrimination using the non-modular parameters instead of the modular parameters. We repeated parameter selection for each of the three sets 20 times (by randomly selecting muscles and/or bins, when appropriate). We used these three sets of parameters to compute the decoding performance of the non-modular muscle activity descriptions and compare it with the decoding performance of the space-by-time decomposition. Note that the VAF could not be evaluated from some these non-modular decompositions because they involve only a subset of the recorded muscles; thus, they can reconstruct only a part of the EMG recordings.
To resolve this issue, we then quantified the maximal decoding and maximal VAF that can be achieved by the dataset under investigation using non-modular descriptions of the data involving all recorded muscles. We described EMG activity in single trials using the rms values from the single-trial recordings of all M = 30 muscles, binned into increasing numbers of temporal bins (B varying from 1 to 50). We input these values to LDA and computed the maximal “non-modular” decoding performance (for B = 50 the non-modular description was identical to the one given by the original data set). Computing the VAF was also possible here, using the following procedure: the reconstructed EMG matrix of this non-modular decomposition was obtained by assuming all time points within each bin to be equal to the rms value of that bin (i.e. a piece-wise constant function). The resulting reconstructed data matrix for each trial had equal dimensions as the original single-trial EMG data matrix ms(t) and was defined as follows: where is the indicator vector function that is equal to 1 on the ith component if t belongs to bin j and to 0 elsewhere, and is the rms value of muscle i for bin j and trial s. Hence, the VAF of the non-modular decomposition can be computed from Eq. 2 by replacing the double sum term by .
2.7 AIC computation and comparisons
As the different numbers of parameters of the modular and non-modular decompositions may affect their performance in data approximation, we also compared them by computing Akaike’s information criterion (AIC, Akaike, 1974) for each decomposition. The AIC is a measure of the goodness of fit, based on the likelihood, that includes the decomposition’s degrees of freedom and constitutes a numerical implementation of Occam’s razor. This measure thus allows a more objective comparison of decompositions with different numbers of parameters. Lower values of AIC correspond to decompositions that achieve a better trade-off between dimensionality reduction and data approximation. Assuming equal variances across all S trials, the AIC can be computed as (Burnham and Anderson, 2002): where k is the number of parameters of the decomposition, S is the number of trials and Err is the reconstruction error of the decomposition. For the modular decomposition whereas for the non-modular decomposition . Regarding the number of parameters that are required by each decomposition to reconstruct complete time-varying muscle patterns, we have k = N × M+ T × P × N × P for the modular decomposition and k = M × B+ T × M × B for the non-modular decompositions, where M × B is the number of rms values and T × M × B corresponds to the number of parameters to define indicator functions.
3 Results
3.1 Basic kinematic features
The four subjects performed 72 different point-to-point movements between pairs of targets among 9 predefined target points. The movement duration ranged, on average across subjects, from 1.05 s±0.18 s (for T3-T6) to 1.64 s±0.30 s (for T9-T1). All finger velocity profiles appeared to be bell-shaped across subjects and conditions, as previously observed in whole-body reaching movement (Thomas et al, 2005; Berret et al, 2009). Targets were attained with an overall mean spatial error of 10 mm±2 mm (from 8 to 15 mm for T6-T3 and T9-T5 respectively). Raw EMG data, associated with motion direction T1-T9, for a typical trial of subject S2, are shown in the Figure 1. These recordings for all conditions and trials formed the EMG matrix that was used for module decomposition.
3.2 Low-dimensional modular decomposition in space and time
We extracted a space-by-time representation of muscle activity by applying the sNM3F algorithm to the EMG recordings of each subject. Figure 2 illustrates the VAF and decoding performance graphs (upper surfaces in all plots) as a function of the number of spatial and temporal modules for one typical subject. These graphs provide insights about the number of spatial and temporal dimensions that are necessary to describe the set of motion directions at hand. For all subjects, VAF exhibits a smooth increase with the number of temporal and spatial modules with no clear saturation point. In contrast, direction decoding performance grows quickly and reaches a plateau for all subjects. The 3D decoding graphs show a larger effect of the spatial dimension on decoding compared to the temporal one indicating that precise muscle groupings may carry more task-related information than precise timing of muscle activations. The sets of temporal and spatial modules were selected as the dimensions of the space-by-time decompositions for which no statistically significant gain (p < 0.05) in decoding was obtained when adding more (spatial or temporal) modules. In particular, four temporal modules (P = 4) appear to carry all decoding power for all subjects, whereas the number of spatial modules varies across subjects and is usually higher (S1: N = 4, S2: N = 6, S3: N = 7, S4: N = 5). The resulting decompositions achieved on average across subjects (mean ± sem) a VAF value of 68%±5% and decoding performance of 86%±1%. The corresponding graphs for all subjects are presented in Figure 2. VAF values may appear relatively small compared to other studies, especially if one sets a somewhat arbitrary threshold such as 90% for selecting the number of modules (Torres-Oviedo et al, 2006; Hart and Giszter, 2004; Ting and Macpherson, 2005). This discrepancy is partly due to the fact that data were not averaged across trials here (for comparison, the VAF obtained from averaged data was 78%±7%, see Discussion for more details on these VAF differences). We assessed the statistical significance of these VAF values by performing a permutation test (see Methods for details on this computation). The lower surfaces in each plot of Figure 2 represent significance levels for VAF and decoding values for unstructured data, which we compared to the ones obtained from the space-by-time decompositions. For the selected number of modules, significance level for VAF is 9%±3% and for decoding performance 19%±5% across subjects. Note that for decoding significance level is higher than theoretical chance level (~1.4%) because our permutation technique preserved the order of trials and motion directions (only muscles were shuffled for each time step). Overall, VAF and decoding scores were significantly larger than their corresponding chance and significance levels. These results validate that the identified space-by-time decompositions account for relevant features of the recorded EMG data and are not just an artifactual output of the methods.
3.3 Consistency and functionality of spatial/temporal modules
We examined the composition of the extracted spatial and temporal modules and their similarity across subjects for the sake of completeness. In the space-by-time decomposition, temporal modules are T-dimensional vectors containing time-varying patterns, accounting for the timing of muscle activity. Here, the identified temporal modules were highly consistent across subjects (mean correlation coefficient r̄ = 0.92). Each temporal module was composed of a single activation burst (Fig. 3) and the four modules were successive in time to describe the temporal profile of muscle activity in different temporal windows of the full movement duration, which is a common finding in literature (Ivanenko et al, 2005, 2004; Chiovetto et al, 2010, 2013). This result may be the consequence of the common organization of all movements consisting in an acceleration phase followed by a deceleration phase.
Spatial modules are M-dimensional vectors of muscle activation levels. The identified spatial modules exhibited higher variability across subjects (mean correlation coefficient ) than the temporal ones, which might be partly explained as a result of differences in movement kinematics, the subject-specific number of spatial modules (thus requiring different muscle groupings) and/or physical discrepancies between participants (muscle sizes, skin conductance etc.). We determined four clusters of spatial modules across subjects. Each spatial module activated muscles spread across the whole body (and on both hemibodies, Fig. 4). This suggests that the extracted spatial modules represent functional muscle couplings for performing the movement at hand rather than purely anatomical or biomechanical groupings of muscles controlling the same joints or body parts.
3.4 Efficiency of the identified space-by-time decomposition in motion direction discrimination
In this part, we aimed to assess the effectiveness of the identified space-bytime decompositions with respect to task performance. To this end, we used the single-trial parameters of the decompositions, i.e. the N × P activation coefficients, to decode which of the 72 directions was performed on each trial. Decoding results are shown as an average confusion matrix across subjects (Fig. 5). Each entry of the confusion matrix C(i, j) represents the percentage of times movement j was decoded as movement i. In Figure 5, only the matrix diagonal shows high values (on average higher than 90%), which indicates highly accurate direction decoding from the way modules are recruited on single trials. We also observe two gray lines parallel to the diagonal (one above and one below the diagonal) indicating incorrect classifications for some pairs of movements (corresponding to 11% of decoding errors on average). These decoding errors concerned pointing movements starting from neighboring points on the same bar and ending at the same point on a different bar. In particular, starting points T1, T2 and T3 (higher level) were confused as T4, T5, and T6 (middle level) respectively and vice-versa (see Fig. 1 for target positions). Hence, these confusions suggest that direction decoding is harder between movements that have the same spatial direction (left or right) and the same endpoint and their starting locations differ only in the height dimension. Starting points at the lower level were confused less often (< 10% decoding errors) probably because of the higher involvement of lower body muscles required for these movements, which distinguishes them from the middle and higher level starting points. Overall, the decoding results suggest that the space-by-time decomposition succeeds in discriminating movements highly accurately. Interestingly, decoding scores were comparable to previous ones obtained during arm reaching with a lower number of kinematic degrees of freedom and fewer muscles and movements involved (Delis et al, 2013b), which suggests that direction decoding levels generalize well to more complex whole-body movements.
3.5 Effectiveness of space-by-time modularity as compared to non-modularity
To assess the effectiveness of the identified space-by-time decompositions with respect to task performance, we used the single-trial parameters of the decompositions, i.e. the N × P activation coefficients, to decode which of the 72 movements was performed in each trial. Then, in order to test the plausibility of the modular decomposition in terms of movement discrimination, we compared its decoding power with the ones obtained with the same number of parameters but taken directly from the recorded muscle activity (see Materials and Methods for details on the extraction of non-modular parameters).
Here to treat all subjects with the same methodology, we first computed the decoding performance of the space-by-time decomposition with 9, 16 and 25 single-trial coefficients (i.e. (N, P) = (3, 3), (N, P) = (4, 4) and (N, P) = (5, 5), respectively, Fig. 6, black curve). We then compared the decoding performance of the modular decomposition with the 95% confidence intervals of decoding performance obtained using three sets of parameters of equal dimensionality (red, green and blue areas) that capture the spatial, temporal and spatiotemporal structure of the EMG data respectively (see Methods for details on these computations). We found that, for all subjects, the space-by-time decomposition carried significantly higher direction discrimination power than all other three sets of parameters, which supports the effectiveness of the proposed modular decomposition in task space. Notably, the lowest performance was obtained when decoding was based only on the temporal parameters (red area) suggesting that temporal dimension carries less information about direction differences, possibly because all reaches were made of one acceleration and one deceleration phases with similar timings. On the contrary, when exploiting the spatial dimension (green area) decoding results were significantly higher. This finding suggests that direction information was mainly carried by the relative activation levels of different muscles (i.e. the spatial dimension), whereas the precise timing of muscle activations did not contribute to the discrimination of different movements. It is noteworthy that not only was decoding lower for the tested sets of non-modular parameters but also this non-modular analysis can reconstruct only a subset of the EMG data.
After establishing the direction discrimination superiority of space-bytime modularity over non-modular alternatives, we aimed to compare the decoding power of the modular decomposition with the maximal decoding that can be achieved by the dataset under investigation (i.e. including the single-trial EMG activity of all 30 muscles -as the spatial dimension carries most of the direction information- and binning their temporal activation profiles, see Materials and Methods for details on these computations). We observed that the maximal decoding performance that can be attained was 95%±4% across subjects, with 5 temporal bins (i.e. 150 decoding parameters, black curve in Fig. 7A). Further increase in the number of bins decreased the decoding performance first slightly and then drastically, confirming our previous findings about the small contribution of temporal precision to decoding. In comparison, the maximum decoding performance obtained by the spaceby-time decomposition was 91%±3% (for N=10, P=10, i.e. 100 parameters, gray curve in Fig. 7A), which was not significantly different (p = 0.54, t-test). Interestingly, a smaller set of modular parameters (25 parameters, N = 5, P = 5) already achieved decoding performance of 84%±5% (not statistically different from the maximal decoding performance, p = 0.1, t-test). Also, decoding performance with modules was always higher than decoding without modularity for comparable numbers of parameters (e.g. 86%±3% for 36 modular parameters -see gray curve- versus 75%±5% for 30 non-modular parameters -see black curve- in Fig. 7A). Notably, the extracted space-by-time decomposition using the optimal number of parameters for each subject (S1: N = 4, P = 4, S2: N = 6, P = 4, S3: N = 7, P = 4, S4: N = 5, P = 4, i.e. 22±1 parameters across subjects) yielded 86%±1% decoding, which is significantly higher than the decoding performance of the non-modular decomposition with 30 parameters (75%±1%, p < 0.05, t-test).
Regarding data approximation power, we showed that increasing the number of bins led to a gradual increase of the VAF values up to a maximum of 100% when the full waveform (50 time bins) was used (see black curve in Fig. 7B). When comparing this with the modular decomposition (gray curve in Fig. 7B), we found that for small numbers of parameters (from 1 to 100), the modular decomposition consistently accounted for equal or higher percentage of variance of the EMG dataset than the non-modular alternative (e.g. for 64 modular parameters: VAF = 79%±1% vs for 60 non-modular parameters: VAF = 75%±1%). Again, the extracted space-by-time decomposition using the optimal number of parameters for each subject (22f1 parameters across subjects) yielded 68%±1% VAF, which is significantly higher than the VAF of the non-modular decomposition with 30 parameters (55%±3%, p < 0.05, t-test). When increasing the number of parameters, the non-modular decomposition achieves higher VAF values reaching 100%, as expected when all data points are used, which would also be the case if choosing N = M and P = T for the extraction of modules (although it would be meaningless regarding the modularity hypothesis). Nonetheless, to determine whether the VAF increase is better than that expected from the increase in the number of parameters, we computed Akaike’s information criteria (AIC) for both decompositions varying the number of parameters (see Table 1 for the AIC values of the modular and non-modular decompositions with different numbers of parameters). The lowest (best) AIC is found for the modular decomposition with 4 temporal and 4 spatial modules, which suggests that this is the optimal decomposition that reconstructs the data best with the minimal number of parameters. AIC values for the non-modular decomposition are consistently higher than for the modular decomposition for all numbers of parameters. Notably, increasing the number of bins leads to higher AIC values, which suggests that the observed VAF increase is solely due to the increase in the number of parameters. Taken together with the decoding results, these findings endorse the modular decomposition as the representation achieving the best trade-off in terms of dimensionality reduction, data approximation and discrimination of motion direction.
4 Discussion
In this study, we probed the effectiveness of space-by-time modularity in describing single-trial EMG signals in a complex motor task. We designed a comprehensive experiment comprising a large number of distinct whole-body pointing movements and collected EMG signals. We assessed the extent to which a space-by-time modular decomposition of muscle activity could approximate the single-trial EMG data and convey task-relevant information. Notably, the space-by-time modular decomposition superseded all non-modular alternative descriptions tested in terms of direction discrimination data reconstruction for comparable dimensionality reduction. We therefore speculate that space-by-time modularity is a plausible piece of the whole machinery used by the CNS to generate voluntary goal-directed movements, although other processes may be necessary to fully account for EMG patterns as discussed below.
4.1 Parsimonious representation of EMG data in space and time
Four bursts of muscle activation characterized the timing of movement-related EMG activity. Notably, this set of temporal modules was very consistent across subjects and more temporal precision did not improve the characterization of the performed motion direction using decoding analysis. Refining the number of temporal modules only contributed to increasing the VAF, i.e. yielded a better reconstruction of EMG patterns. This finding shows that all task-relevant information is conveyed in four successive temporal recruitments that may correspond to different phases of goal-directed movement, i.e. postural stabilization over starting point, movement initiation, movement deceleration and stabilization over endpoint and that additional temporal modules would only be needed to characterize fine-grained but task-independent parts of muscle activity. Similar findings have been reported in a previous study that applied the space-by-time decomposition to 2-dimensional arm reaching (Delis et al, 2014) where three temporal bursts were identified (without the initial postural muscle activation) as well as in other studies examining temporal modularity (Chiovetto et al, 2010, 2013). In a similar vein, four or five temporal primitives have been shown to explain locomotive motor behaviors (Ivanenko et al, 2004, 2005; Dominici et al, 2011) and limb reaches in spinal animals (Hart and Giszter, 2004). Regarding the effect of high-pass filtering, high-frequency fluctuations of the EMG signals were analyzed in past research on spinal animals (Hart and Giszter, 2004). These studies identified temporal modules of similar shape to the ones found here, which supports the identified temporal structure of motor modules irrespective of the study-specific signal preprocessing procedure (Hart and Giszter, 2010; Kargo and Giszter, 2008). We also note that when using an extension of the sNM3F that includes temporal shifts/delays in the temporal modules (as suggested in Delis et al, 2014), we did not obtain any direction discrimination gain but VAF could be higher. This suggests that the task-related temporal structure of muscle patterns is well explained by the extracted temporal modules and that any minor variations in muscle timings likely require time shifts of those modules.
In space, our module extraction algorithm clustered muscles into a few (4-7) spatial modules consisting of muscles from different body parts and hemibodies suggesting that groupings did not arise solely as a result of anatomical constraints or biomechanical couplings. Also, each spatial module was typically activated to perform many different movements and each movement was performed using the simultaneous activation of many spatial modules, which suggests that the spatial modules are not direction-specific but rather functional groups of muscles shared across movements whose weighted recruitment actually codes the motion direction being performed (Tresch et al, 1999; Torres-Oviedo et al, 2006; Delis et al, 2013b, 2014; d’Avella et al, 2006, 2008). Notably, more variability in muscle groupings was found across participants. First, this could be due to the fact that subjects had a different optimal number of spatial modules, thus requiring different muscle groupings. Second, such inter-subject differences in muscle synergies have been reported in previous studies (Hug et al, 2010; Guidetti et al, 1996; Frère and Hug, 2012) and are expected to be more pronounced when more muscles are recorded. Indeed, in complex motor tasks (e.g. whole-body-reaching), spatial variability may increase for multiple reasons: different skin conductions or muscle characteristics, different motion kinematics and dynamics, etc. Furthermore, different muscle groupings across individuals may be the outcome of learning or developmental processes, which have recently started to be investigated (Dominici et al, 2011). Therefore, the modular control hypothesis is compatible with the finding that different participants could exhibit different motor modules. However, some muscles such as left Ta and Pr appeared to work in synergy for all participants.
4.2 VAF and discrimination power compared to other studies
We found that a small set of spatial and temporal modules described muscle activations during performance of a wide range of whole-body pointing movements. This parsimonious representation explained a large part of the variability of the EMG recordings (significantly more than chance), although the actual VAF values we obtained here may be relatively low compared to other studies. One main reason is that the present EMG data exhibited a higher level of variability as a result of a) extracting modules from single-trial data (30 trials for each motion direction), i.e. not resorting to averaging and b) studying a very large set of different directions (72, which gives 2160 recordings). When extracting modules from trial-averaged data, the average VAF across subjects was 78%, which is comparable to VAF values found in other studies using a smaller number of distinct movements and time-shifts d’Avella et al (2006). Note also that our formula for computing the VAF was relatively conservative compared to other formulas that could give arbitrarily larger VAF values, for instance by replacing the mean muscle pattern by zero in the denominator of Eq. 2(Torres-Oviedo et al, 2006). Notwithstanding this, the low VAF and the residual reconstruction error could be taken as evidence that space-by-time modularity can only give an incomplete description of EMG patterns unless the number of modules is largely increased. This was already noticed in another motor primitive study where high VAF levels were needed to accurately reproduce single muscle patterns (Zelik et al, 2014). Interestingly, we showed that the low-dimensional EMG description was nevertheless associated to a surprisingly high decoding performance in spite of the fact that direction decoding is not a direct objective of the decomposition algorithm.
In particular, the spatial (muscle) dimension of EMG activity appeared to carry more task information than the temporal dimension, which is consistent with previous findings (Delis et al, 2014). The low task information carried by the temporal modules may be partly explained by the fact that EMG single-trial data need to be normalized in time to have equal lengths before being input to matrix factorization algorithms. Time normalization is useful in order to align trials with different durations (and mandatory in current NMF-based methods), however its impact on the task information carried by the resulting signals needs to be investigated further. It is also unlikely that varying the speed instructions (i.e. including different speed conditions in the analysis) would have improved the direction discrimination power of the temporal modules as this was not the case for planar arm reaching movements Delis et al (2013b). Strikingly, however, our decoding results are comparable and even higher than the ones obtained in the simpler 2- dimensional arm reaching study mentioned above (86% versus 80%) as well as in other studies investigating grasping movements (Weiss and Flanders, 2004; Overduin et al, 2010; Leo et al, 2016). This decoding advantage we obtained here can be explained as a result of two main differences with prior work. First, we used a more flexible model of muscle activation modularity, namely the space-by-time decomposition, which was shown to have higher movement discrimination power compared to alternative models (Delis et al, 2014). Second, we investigated a whole-body reaching task, for which, in contrast to arm reaching and grasping, a) a large number of muscles with complementary functional roles can be recorded using surface EMGs and b) activations of several muscles are expected to be markedly different across motion directions. We also note that, in this study, a larger number of temporal and spatial modules was required to achieve these decoding values. This difference indicates that the number of dimensions is dependent on the set of motion directions under consideration, hence, to draw more general conclusions about dimensionality, it is important to examine other motor behaviors that are as unconstrained as possible. It must be noted here that discriminating between similar reaching movements to neighboring targets is likely considerably more difficult than discriminating between very different motor behaviors where EMG patterns differ drastically. This point is exemplified via our confusion matrix analyses. Overall, the decoding analysis reveals that recruitment of muscle groupings (spatial modules) is highly dependent on the direction of hand displacement. Their combination with adequate timing signals (temporal modules) via the descending motor commands (activation coefficients) leads to unequivocal characterization of distinct movements.
4.3 Space-by-time modularity for motor control
The high direction decoding values we found here show that a large number of movements can be reliably coded in the low-dimensional space defined by the spatial and temporal modules (Delis et al, 2013b,a; d’Avella and Lacquaniti, 2013; Alessandro et al, 2013b). Considering the neural basis of such a modular scheme, our modeling assumes that the (invariant) spatial and temporal modules may be hardwired in the spinal cord or brain stem. Then, the (variable) activation coefficients may be triggered by motor cortical structures that generate the descending neural command which recruits on each trial a specific set of modules to coordinate the task-specific recruitment of several muscles and execute the movement at hand. This view is compatible with evidence that, when stimulated, the motor cortex in the primate brain is able to coordinate behaviorally relevant actions, whereby neuronal activity may trigger such goal-directed, multijoint reaching movements (Graziano, 2006). Recently, similar evidence for a high-level encoding of ethological actions has been found in the precentral gyrus of patients undergoing brain surgery (Desmurget et al, 2014). Our findings are not incompatible with such a hierarchical neural implementation of action since the tested modular decomposition was provably highly effective for motion direction encoding and accomplishment. However, as already mentioned, the relatively low VAF could also be taken as evidence against the modular control hypothesis suggesting that the set of whole-body movements under consideration cannot be described in all its finesse with a small set of invariant spatial and temporal modules. Undeniably, the space-by-time decomposition gives only a crude picture of the recorded muscle patterns on single trials. Theoretically, all trial-to-trial EMG variations should be captured by the model but a much larger number of modules would be required to achieve VAF above an arbitrary 90% threshold, a problem which would be the same with other existing muscle synergy models (Zelik et al, 2014). While we have provided some computational arguments explaining why the VAF is lower than the one observed in prior studies on the topic, we cannot dismiss a potential departure from the proposed feedforward modular control scheme. Indeed, the role of feedback is largely neglected in such models and might be relevant to better replicate the recorded muscle patterns. Hence we speculate that the high movement discrimination capacity but medium data approximation may reflect the fact that feedback and/or intermittent control processes occur during motor execution and thus, these processes should be modeled in muscle synergy studies (see below). More generally, the question of falsifying the modular control hypothesis has proved to be a difficult challenge although it has been tackled by several neurophysiological and computational works as discussed hereafter.
4.4 Critical evaluation of modular motor control
Similarly to our approach, a large number of recent studies attempted to assess modular organizations in terms of their effectiveness in motor control and learning (Kutch and Valero-Cuevas, 2012; Valero-Cuevas et al, 2009; d’Avella and Pai, 2010; Berger et al, 2013; Berger and d’Avella, 2014; Bengoetxea et al, 2014; Inouye and Valero-Cuevas, 2016). In the same vein, other authors have proposed approaches to address the plausibility of modularity in motor control (Giszter, 2015). In particular, monkey electrophysiology (Graziano et al, 2002; Holdefer and Miller, 2002; Overduin et al, 2012, 2014, 2015), human neuroimaging (Asavasopon et al, 2014; Rana et al, 2015) and computational studies (Laine et al, 2015) were employed to investigate the neural origins of motor modularity. Also, modeling studies examined whether optimal motor control can rely on modular decompositions (Nori and Frezza, 2005; Chhabra and Jacobs, 2006; Berniker et al, 2009; Neptune et al, 2009; Alessandro et al, 2013a). Finally, studies of human motor behavior investigated the robustness of modules by imposing alterations on muscle coordination of healthy individuals (de Rugy et al, 2012, 2013; Nazarpour et al, 2012; Steele et al, 2015) and testing muscle activations in clinical populations (Gizzi et al, 2011; Clark et al, 2010; Cheung et al, 2012; Roh et al, 2013).
Here, we proposed a computational approach that addresses three important issues for the critical evaluation of modularity, in agreement with the guidelines developed by Gao and Ganguli (2015) in neuroscience. First, to assess whether modularity may be employed as a strategy for “simplifying” the degrees-of-freedom problem of motor control, modularity should be examined in high-dimensional spaces. We propose the design of an experiment that comprises as many movements as possible and multiple repetitions of the same movement executions and involve time-varying EMG recordings of as many muscles as possible in that spirit (Steele et al, 2013, 2015). Our experiment here comprises more motion directions than any other study so far and also considers a complex daily-life motor behavior of whole-body reaching while standing for which a large inter-individual variability may exist Hilt et al (2016). Second, it is crucial to assess whether modularity can fulfill its functional role, that is, reliable and unequivocal representation of a wide variety of movements. To test this, we relied upon a direction decoding metric that evaluates quantitatively the mapping between module activations and movements (Delis et al, 2013b). It must be noted that decoding is not an objective of classical modular decomposition methods that aim only at minimizing reconstruction errors. Therefore, the large direction decoding performance we discovered is quite a striking result. We further used this decoding metric to compute the maximum of task-relevant information contained in the recorded EMG signals. We argued that if our modular decomposition carries most of this information, it is a plausible strategy for performing these movements. With a view to ultimately validating or falsifying the modular hypothesis, we suggest that this approach can be a useful tool for testing the effectiveness of modular decompositions in describing direction differences. Third and most importantly, we used the above two predictions of the modularity hypothesis to directly compare modular with non-modular structures. Such a test is crucial for reaffirming the plausibility of modular decompositions and is only rarely implemented Inouye and Valero-Cuevas (2016). Here, we found that the optimal modular decomposition had significantly higher direction discrimination and data reconstruction power than its non-modular alternative, which provides an insightful contrast. We also computed AIC to assess how well the modular decomposition trades off dimensionality reduction for data reconstruction, which is at the core of the modularity hypothesis. Our findings validated our chosen number of modules and showed that the optimal modular decomposition achieved a better trade-off than the best non-modular decomposition regardless of the number of parameters. Future work in this direction is nevertheless needed, in particular to understand if higher VAF could be achieved with more advanced synergy models and if temporal modules that carry more task information could be identified.
4.5 Future work
Future research directions should involve investigating alternative formulations of the modular control hypothesis that allow refining motor programs by adapting the modular decompositions for specific task demands, possibly assuming intermittent control (Karniel, 2013). Considering how well the extracted modules allow reconstructing individual muscle activities (Zelik et al, 2014) seems also pertinent to understand if all critical features of muscle activity are considered when generating genuine muscle patterns from a small number of invariant modules. Finally, considering variants of the module extraction algorithm may improve the quality of data approximation and direction discrimination. For example, a method that incorporates the task discrimination objective within the module extraction process was shown to identify decompositions with nearly perfect task discrimination power while preserving the same levels of VAF (Delis et al, 2015). Developing methods allowing to avoid time normalization and allowing to consider recruitment of modules via feedback signals would also be very useful. In conclusion, further computational and experimental work is required to investigate the motor modularity hypothesis for the neural control of movement.
Acknowledgements
This research was supported by the « Institut National de la Santé et de la Recherche Médicale » (INSERM), the « Conseil Général de Bourgogne » (France) and the « Fonds européen de développement régional » (FEDER).