A comparison of wearable fitness devices

Background: Wearable trackers can help motivate you during workouts and, in combination with your smartphone, provide information about your daily routine or fitness without requiring potentially disruptive manual calculations or records. This paper summarizes and compares wearable fitness devices, also called "fitness trackers" or "activity trackers." These devices are becoming increasingly popular in personal healthcare, motivating people to exercise more throughout the day without the need for lifestyle changes. The choices in the market for wearable devices are also increasing, with customers searching for products that best suit their personal needs. Further, using a wearable device or fitness tracker can help people reach a fitness goal or finish line. Generally, companies advertise these products and depict them as beneficial, user friendly, and accurate. However, there are no objective research results to prove the veracity of their words. This research features subjective and objective experimental results, which reveal that some devices perform better than others.
Methods: The four most popular wristband style wearable devices currently on the market (Withings Pulse, Misfit Shine, Jawbone Up24, and Fitbit Flex) are selected and compared. Accuracy is one of the key components of fitness tracking, and some devices perform better than others. This research uses subjective and objective experimental results to compare the accuracy of the four wearable devices in conjunction with the user friendliness and satisfaction reported by seven real users. In addition, this research compares the opinions of reviewers on Internet sites with those of the subjects who used the devices.
Results: Withings Pulse is the most user friendly and satisfactory from the users' viewpoint. It is the most accurate and repeatable for step and distance tracking, which is the most important measurement of fitness tracking, followed by Fitbit Flex, Jawbone Up24, and Misfit Shine. In contrast, Misfit Shine has the highest score for design and hardware, which is also appreciated by users.
Conclusions: From the results of experiments on four wearable devices, it is determined that the most acceptable in terms of price and satisfaction levels is the Withings Pulse, followed by the Fitbit Flex, Jawbone Up24, and Misfit Shine.


Background
Nowadays, people are very interested in wearable devices, as these are the current technology trend for tracking daily life activities. The best activity trackers on the market today are highly evolved cousins of pedometers. They are smarter and more accurate and can do much more than just calculate how far you walk [1].
A wearable device is a new type of technology in the form of small hardware that includes an application for tracking and monitoring fitness metrics such as distance walked or run, calories consumed, and in some devices heart rate and sleep. The term is now used primarily in reference to dedicated electronic monitoring devices that are synced, in many cases wirelessly, to a computer or smartphone for long-term data tracking. There are also smartphones with the independent ability to track [2]. Wearable devices are tiny, state-of-the-art computers that users wear on various parts of their bodies, such as glasses [3], smart watches, wristbands, or bracelets [4], or clip onto their clothing [5].
Wearable technology has become popular; it allows the wearer to access information in real time. Applications can be used in the fields of health, fitness, food, and aging [1]. Further, it is possible to automate the monitoring and recording of daily activities or fitness. It is also possible to integrate them into more easily worn equipment. The wearable device should monitor workouts and display information about the user's daily routine on its screen or on a smartphone. This is a more comfortable and convenient method for the wearer than the old method, which required one to calculate the distance or running steps manually.
Reviews of wearable trackers appear on many Internet sites. Often, they show different opinions about the reviewed products. However, these opinions are subjective and do not present any research results that establish the accuracy of the information on the devices or the identity of the subjects in the experiments or of the reviewer. Further, there are no objective data, such as a concrete comparison table, to show the results for the products reviewed. For example, from "Top Ten Reviews" [7], the best wearable device reviewed was the Fitbit, followed by the Jawbone and Withings Pulse. On this site, the scores were tabulated and compared, but no details were provided about where the information originated. Another example of a wearable tracker review site is "Best Fitness Tracker" from "TechAdvisor" [8], where the best tracker was Jawbone, followed by Misfit, Fitbit, and Germinly. The reviews on this site do not include a physical comparison table but, as mentioned on the site, are only reviews from a single blogger. Even though this kind of review website has no objective information, it is advantageous for customers who plan to buy this kind of product because it can help them find the most suitable option. Such reviews would benefit customers more if they included real objective comparison results, which would help customers choose the device that best fits their needs.
This paper summarizes and compares the satisfaction, user friendliness, and accuracy of currently popular wearable devices (wristband type) that are found in the top ten of best 2015 fitness trackers according to reviews and comparisons [6][7][8]. Four wearable devices were chosen randomly from the top ten products in the review comparisons. The four selected wearable devices were the Fitbit Flex (Fitbit Inc., San Francisco, California, USA) [9], Withings Pulse (Withings SA, Issy les Moulineaux, France) [5], Misfit Shine (Misfit Inc., Apple Inc., Apple, Mitten Rd., Burlingame, California, USA) [10] and Jawbone Up24 (Jawbone, San Francisco, California, USA) [11]. Subjective and objective research results will reveal the trackers with the best accuracy and user friendliness based on physical information from real users.
All four devices have multifunction capabilities, such as a step counter, caloric tracker, distance counter, and sleep tracker. The functions are similar, but each device differs in calculation algorithm, user interface, and application. This paper reviews 1) the overall specifications of the four devices (for example, hardware, functions, and features in the smartphone application); 2) a comparison of the user satisfaction scores; 3) users' opinions in experiments; 4) reviews of the wearable devices by bloggers or reviewers from Internet sites, compared against physical information and personal observations from real users; and 5) the accuracy and repeatability of activity tracking for each model.

Wearable devices in experiments
The four wristband devices used in the experiments were selected randomly from the devices in the top ten review rankings [6,7] that were available in Korea (see Fig. 1). The four devices are described in detail below. Table 1 counting and distance travelled. The Jawbone Up24 is designed with only one operating button and has a price of US $100 [7,11].

The user interface application (UI app) of each device
Most wearable devices differ in their user interfaces. The UI design for wearable devices should be simple, clear, and quick to navigate for users' comfort [15]. This can be difficult because wristband type wearable devices are small. Thus, the UI app on a smartphone that links to the wearable device is also an important feature for users. The companion application of a wearable device on a smartphone must be available for easy download. Handheld apps are also useful for heavy processing, analysis, data storage, network actions, or other work [16]. Table 2 shows the comparison of the UI app on smartphones for the four devices explored.

Participants in experiments
Seven healthy subjects participated in the experiments, comprising six healthy men (adults aged 27-50 years, mean age 31 years, mean height 171.5 cm, and mean weight 68.18 kg) and one healthy woman (adult aged 30 years, height 160 cm, and weight 42.1 kg). All participants were graduate students in the Department of Electrical and Electronic Engineering, Hankyong National University, South Korea. All clinical experiments were carried out from July 2015 to August 2015 with the approval (GIRBA2248) of the Gachon University Institutional Review Board (Incheon, South Korea).
All subjects wore each wearable device for 1 week, changing to the next device at the end of each week. While using the devices, all subjects were asked to note the results of use, their satisfaction scores, and their opinions about the advantages and disadvantages of each device. All four devices were then worn together in a single evaluation to check and compare the accuracy of each device (the details of the experiments are explained in the experimental methods that follow).

Satisfaction of subjects using the wearable devices
In this experiment, each subject wore a wearable device for 1 week, after which they all completed the satisfaction evaluation form, which consisted of two sections. (One subject wore the devices for 1 month to test all four devices.)

Section 1. The Likert scale evaluation for each device
Subjects provided a Likert score for each condition of each device on a maximum five-point scale, according to its general design, features, and functionality after wearing and using it for a week. The scale of satisfaction ranged from one to five points (see Table 3). This section of the evaluation form consisted of the following two parts:
Part 1. The satisfaction score for features and properties
In this part, the subjects scored their satisfaction with the features and properties of each device. This included the general design (hardware), synchronization, user interface (UI app), battery, friendliness, and ease of use.
Part 2. The satisfaction score for the metric function of the devices
In this part, the subjects scored their satisfaction with the metric function of each device. This included step, distance, sleep, and calorie (nutrient) analysis.

Section 2. Opinion on each device
In this section, the subjects registered personal comments about the advantages and disadvantages they observed while using each device. These opinions and comments are summarized with the user feedback results (Table 5).

Experiment for accuracy and repeatability of each device
The functions of the wearable devices on the market are similar. However, each device differs in calculation algorithm, user interface, and application. Accuracy and repeatability are the two factors that lead the wearer to the real finish line, goal, or diet limit, although other factors such as weight, height, age, and gender also play a role. Thus, suggesting the best among these four wearable devices requires exploring their accuracy and repeatability using an objective method and real experimental data. The four devices were attached to a subject's wrist (see Fig. 2), and the accuracy and repeatability were measured after testing. The three experiments tested the distance travelled to determine the accuracy and repeatability of all devices. The repeatability was calculated as Cronbach's alpha using the SPSS program (SPSS V.2012, IBM Corporation, USA). Subsequently, we scored the four devices from best to worst, as defined in Table 4.

Step counting when walking up and down stairs
Subjects wore the devices while walking up four flights of stairs; this was repeated five times. The subjects then walked down the stairs, which was also repeated five times. After all data for the experiments were collected, accuracy and repeatability scores were assigned to the devices on a scale from one to four, with four representing the best accuracy and best repeatability among the four devices (see Table 4).
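The repeatability statistic named above can be reproduced outside SPSS. The following is a minimal sketch of the standard Cronbach's alpha formula, assuming that each repeated trial is treated as an "item" and each subject (or walking session) as a "case"; the function name and the data layout are illustrative assumptions and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(trials):
    """Cronbach's alpha for k repeated trials measured on the same n cases.

    trials: list of k lists; trials[j][i] is the reading of case i in trial j.
    (Hypothetical layout: the study does not describe how its data were arranged.)
    """
    k = len(trials)
    if k < 2:
        raise ValueError("need at least two repeated trials")
    n = len(trials[0])
    if n < 2 or any(len(t) != n for t in trials):
        raise ValueError("every trial needs the same number (>= 2) of cases")

    # Sum of the sample variances of the individual trials.
    item_variance_sum = sum(variance(t) for t in trials)

    # Sample variance of the per-case totals across all trials.
    totals = [sum(t[i] for t in trials) for i in range(n)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)
```

Values close to 1 indicate that the repeated trials rank the cases consistently. Whether SPSS was configured in exactly this way in the study is not stated.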

Satisfaction of subjects
After the subjects wore and used each device for a week, they entered the Likert scores into the evaluation form, which included details about the features and properties of the devices, including the UI application. The scale of satisfaction scores is displayed in Table 3. Figure 3 shows the mean score for the five feature conditions: device design, battery use, smartphone synchronization, UI application, and ease of use. Figure 4 shows the mean and standard deviation of the satisfaction scores for the four main functions of each device: step counting, sleep tracking, distance tracking, and caloric (or nutrient) analysis. Heart rate analysis was not included in the evaluation scores because only the Withings Pulse has this function.
From the results, the Withings Pulse had the highest satisfaction score, followed by the Misfit Shine, Jawbone Up24, and Fitbit Flex.
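The mean and standard deviation summaries plotted in Figs. 3 and 4 can be computed as in the following sketch. It assumes the sample standard deviation (the text does not say whether the sample or population form was used), and the device names and scores are invented placeholders, not the study's data.

```python
from statistics import mean, stdev


def summarize_likert(scores_by_device):
    """Return {device: (mean, sample standard deviation)} for 1-5 Likert scores."""
    summary = {}
    for device, scores in scores_by_device.items():
        if len(scores) < 2:
            raise ValueError("need at least two scores per device")
        if any(not 1 <= s <= 5 for s in scores):
            raise ValueError("Likert scores must lie between 1 and 5")
        summary[device] = (mean(scores), stdev(scores))
    return summary


# Invented scores from seven hypothetical subjects; NOT the study's data.
example = {
    "Device A": [4, 5, 4, 3, 5, 4, 4],
    "Device B": [3, 3, 4, 2, 3, 4, 3],
}
for device, (avg, sd) in summarize_likert(example).items():
    print(f"{device}: mean={avg:.2f}, sd={sd:.2f}")
```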

User feedback
We summarized the opinions of the seven subjects gathered while using the devices. Table 5 lists only those opinions that were expressed in similar terms by two or more subjects.
From Table 5, it is apparent that all four devices received satisfactory and unsatisfactory feedback from the subjects.
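The rule of keeping only opinions shared by two or more subjects can be sketched as a simple count. This assumes that similar free-text comments have already been grouped by hand under a common label; the labels and the function name below are hypothetical.

```python
from collections import Counter


def shared_opinions(comments_by_subject, min_subjects=2):
    """Return the opinion labels mentioned by at least `min_subjects` subjects."""
    counts = Counter()
    for comments in comments_by_subject.values():
        # set() ensures a subject is counted once per opinion label.
        counts.update(set(comments))
    return sorted(label for label, n in counts.items() if n >= min_subjects)


# Hypothetical labels for illustration only.
example = {
    "subject 1": ["no screen", "comfortable"],
    "subject 2": ["no screen", "sync lost"],
    "subject 3": ["comfortable", "no screen"],
}
print(shared_opinions(example))
```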

Additional information (opinions of commercial reviewers on internet sites about the advantages and disadvantages of the devices)
This section summarizes the advantages and disadvantages of the devices as claimed by reviewers on related websites. The review sites selected were the first five results returned by the Google search engine [36] for a device name followed by the keyword "review," such as "Jawbone Up24 review." Their high ranking suggests that these reviews are well known and widely visited by people interested in wearable devices. These claims by reviewers might help customers seeking to buy a device make a choice more easily. Although reviews on websites can be advantageous, nobody can be certain whether the claims are influenced by the manufacturer or are genuine reviews from independent sources. An opinion or claim may come from only one subject or only the reviewer who uses a product. This section therefore explores whether the pros and cons claimed by the reviewers are similar to the opinions of customers and of the seven subjects in this study. Tables 6, 7, 8, and 9 show the summarized advantages (pros) and disadvantages (cons) of each of the four devices from reviewers on the websites.
Fig. 3 Bar graph comparing mean and standard deviation of the feature satisfaction scores given by subjects when using the devices
In Tables 5 and 6, matching opinions are shown between the seven subjects and those of the reviewers, implying that the Jawbone Up24 has a good design and fits comfortably. The UI app is colorful and easy to understand. The sleep tracker is very smart and also has good alarm functions. However, disadvantages of the device (cons) include the lack of a screen, inadequate waterproofing, and a complex battery charger.
Tables 5 and 7 display matching opinions between the seven subjects and the reviewers, implying that the Withings Pulse has good primary features, such as the heart rate function. The display is large and can show the tracking results. The data log updates itself via wireless synchronization using Bluetooth. However, the Withings design is not impressive. The display is difficult to see and read in sunlight, and the sleep tracking is not automatic.
Tables 5 and 8 list the matching opinions between the seven subjects and the reviewers, who agree that the Fitbit Flex has a sleek, slim, and good design; is fully water resistant; and has strong social features. However, its weak points include its lack of a screen, the difficulty of using the food log and calorie tracking in the UI app, and tap controls on the device that are sometimes confusing.
Finally, Tables 5 and 9 show the matching opinions between the seven subjects and the reviewers, who agree that the Misfit Shine has an attractive, elegant, and fashionable design. It is highly waterproof and especially good for watersports. The goal tracking motivates the user, and the battery is replaced rather than recharged. However, the Misfit Shine works only with iOS. Android compatibility has been announced but is not yet available. Checking the tracking status requires a smartphone. Sometimes, data are inaccurate because of lost syncing to the smartphone.
The most obvious problem among the devices was that all of them experienced automatic loss of synchronization, which made it difficult or impossible to update data or resulted in an incorrect report. In contrast, all subjects could use the devices easily and required little to no instruction. This means that the devices were user friendly and easy to use.
Table 10 shows that the best device for accuracy and repeatability of indoor walking measurements is the Withings Pulse, with an accuracy of 99.90 % and repeatability of 0.86. The total scores for each device are shown in Fig. 5. The Withings Pulse has the highest score among the four devices for both repeatability and accuracy. The lowest accuracy and repeatability were recorded by the Misfit.
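This section does not give the formula behind the accuracy percentages in Table 10. As a minimal sketch, one common definition is assumed here: accuracy equals 100 % minus the absolute percentage error of the device reading relative to a reference value (for example, a manually counted number of steps). Both the definition and the function names are assumptions, not the study's stated method.

```python
def accuracy_percent(measured, reference):
    """Accuracy as 100 % minus the absolute percentage error (assumed definition)."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 100.0 * (1.0 - abs(measured - reference) / reference)


def mean_accuracy(measurements, reference):
    """Average accuracy over repeated trials against the same reference value."""
    if not measurements:
        raise ValueError("need at least one measurement")
    return sum(accuracy_percent(m, reference) for m in measurements) / len(measurements)
```

Under this definition, over-counting and under-counting by the same amount are penalized equally.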

Experiment for accuracy and repeatability of each device
With regard to Table 5 (opinions of the seven subjects) and Tables 6, 7, 8, and 9 (opinions of the reviewers), we concluded that the Misfit and Fitbit have difficulty in detecting when a user climbs or descends stairs. The subjects' experiments (Table 10 and Fig. 5) agree: in the stair climbing and descending tests, these two devices had the lowest scores for both accuracy and repeatability. Thus, the total scores of the Misfit and Fitbit are the lowest among the four tested wearable devices in terms of accuracy and repeatability.

Discussion
As the results showed, the reason for the low scores earned by the Misfit Shine and Fitbit Flex was stair tracking. These two devices could not track activity when the subjects climbed or descended stairs. For this reason, users were disappointed in these devices.

Section 1
The satisfaction evaluation form considered the following conditions: synchronization, UI app, hardware design, step counting, sleep tracking, nutrient analysis, caloric analysis, battery, and ease of use. The highest satisfaction among the five users of the four devices was gained by Withings, with Misfit, Jawbone, and Fitbit following behind. In addition to Section 1, the opinions of the seven subjects and reviewers on Internet sites were summarized. This showed that each device has different advantages (pros) and disadvantages (cons). However, from the evaluation form and satisfaction scores, the subjective results of real users can be summarized as follows:
-Jawbone Up24 is well designed and fits the subjects comfortably. The UI app is colorful and easy to understand. The sleep tracker is very smart and also has good alarm functions. However, disadvantages (cons) include the lack of a screen, inadequate waterproofing, and a complex battery charger.
-Withings Pulse has good features such as the heart rate function, which can detect pulse rate. The Withings display is large and can show the tracking results. The data log updates itself via wireless or Bluetooth syncing. However, the Withings design is not impressive: the display is difficult to see and read in sunlight, and the sleep tracking is not automatic.
-Fitbit Flex has a sleek, slim, and good design; is fully water resistant; and has strong social features. However, its weak points include the lack of a screen (only LEDs) and tap controls that are sometimes confusing. The food log and caloric tracking in the UI app are difficult to use and have a steep learning curve.
-Misfit Shine has an attractive, elegant, and fashionable design. It is fully waterproof and especially good for watersports. The goal tracking function motivates the user to achieve goals, and the battery does not need recharging but rather exchanging. However, the Misfit Shine works only with iOS. Android compatibility has been announced, but is not yet operational. The display to check tracking status requires a smartphone because it has no built-in display. Data are sometimes inaccurate because of lost syncing with the smartphone.

Section 2
The experiments compared the accuracy and repeatability of the four wearable devices. Four points were awarded for the best accuracy and repeatability, and three, two, and one point were given to the second, third, and fourth devices, respectively. The most accurate and repeatable device was the Withings, followed by the Jawbone, Fitbit, and Misfit. In contrast, the Misfit had the highest score for design and hardware. Thus, physical design is also appreciated by users. The Withings was the most user friendly and satisfactory from the users' viewpoint. The Withings was also the most accurate and repeatable for step and distance tracking.
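The four-to-one point scheme described above can be sketched as follows. The device labels and values are invented placeholders, the helper names are hypothetical, and ties are not handled because the text does not say how they were resolved.

```python
def rank_points(values, higher_is_better=True):
    """Give N points to the best device, N-1 to the next, down to 1 for the worst.

    values: {device: measured value}. Ties keep their input order (an
    arbitrary choice made here, not a rule taken from the study).
    """
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    top = len(ordered)
    return {device: top - position for position, device in enumerate(ordered)}


def total_points(*point_tables):
    """Add up the points each device earned across several experiments."""
    totals = {}
    for table in point_tables:
        for device, points in table.items():
            totals[device] = totals.get(device, 0) + points
    return totals


# Invented values for four hypothetical devices; NOT the study's results.
accuracy = {"A": 99.0, "B": 95.0, "C": 97.0, "D": 90.0}
repeatability = {"A": 0.80, "B": 0.90, "C": 0.70, "D": 0.60}
print(total_points(rank_points(accuracy), rank_points(repeatability)))
```

Summing the points from each experiment gives a total per device that can be compared in the way Fig. 5 compares the four devices.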

Conclusion
This research attempted to identify the best of the four selected wearable devices. The study used both objective and subjective methods to obtain results based on physical comparison, and the results are independent of manufacturers' claims. The two main methods of testing verified the quality of the devices, both objectively and subjectively. From the author's viewpoint, the most common criticism of wearable devices is that they cannot display information but require a smartphone to send the metric data and reports. The capacity for storage of results is larger in a mobile phone, but it is inconvenient to use both at the same time. Moreover, many fitness tracking applications are presently available through online stores for free without requiring any special or specific hardware. This is very convenient for people who focus on their health or fitness. Although the reports generated by such apps are not guaranteed to be 100 % accurate, they provide the easiest way to track users' activity without any cost. Thus, the companies that have introduced fitness trackers or wearable devices into this highly competitive market can continually develop new eye-catching products and reduce errors by listening to the feedback and opinions of users from this study to reach a wider market. Technology and aesthetics must go together: unobtrusive designs that are sleek, modern, and lightweight; waterproof functionality; multiple options for recharging the battery; accuracy and repeatability for simple activities such as climbing or descending stairs; and the monitoring of vital parameters (heart rate, pulse rate, body temperature, respiration, or others) should be considered or added. Nonetheless, the development of wearable devices is moving rapidly, with the release of numerous gadgets and new generations. This paper addresses consumer needs with information regarding the performance of four such new gadgets.

Ethics approval and consent to participate
All clinical experiments were carried out from July 2015 to August 2015 with the approval (GIRBA2248) of the Gachon University Institutional Review Board (Incheon, South Korea). All participation was voluntary. Written informed consent was obtained from each participant. A copy of the signed consent form, as well as instructions regarding the fasting period and contact information, was delivered to each participant. Participants also had the option of withdrawing or discontinuing at any time before and during data collection.