From 19ad80b0ac5da1366f1a8ed64d211ec05d878a39 Mon Sep 17 00:00:00 2001 From: LiLei Date: Mon, 7 Aug 2017 11:59:55 +0800 Subject: [PATCH 01/15] =?UTF-8?q?=E6=B7=B1=E5=BA=A6=E5=AD=A6=E4=B9=A0?= =?UTF-8?q?=E7=B3=BB=E5=88=971=EF=BC=9A=E8=AE=BE=E7=BD=AE=20AWS=20&=20?= =?UTF-8?q?=E5=9B=BE=E5=83=8F=E8=AF=86=E5=88=AB=20(#1967)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...ning-1-setting-up-aws-image-recognition.md | 94 ++++++++++--------- 1 file changed, 50 insertions(+), 44 deletions(-) diff --git a/TODO/deep-learning-1-setting-up-aws-image-recognition.md b/TODO/deep-learning-1-setting-up-aws-image-recognition.md index b2fb2c0a009..5a18e49954a 100644 --- a/TODO/deep-learning-1-setting-up-aws-image-recognition.md +++ b/TODO/deep-learning-1-setting-up-aws-image-recognition.md @@ -3,104 +3,110 @@ > * 原文作者:[Rutger Ruizendaal](https://medium.com/@r.ruizendaal) > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-1-setting-up-aws-image-recognition.md](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-1-setting-up-aws-image-recognition.md) -> * 译者: -> * 校对者: +> * 译者:[lileizhenshuai](https://github.com/lileizhenshuai) +> * 校对者:[Tina92](https://github.com/Tina92) [sqrthree](https://github.com/sqrthree) -# Deep Learning #1: Setting up AWS & Image Recognition +# 深度学习系列1:设置 AWS & 图像识别 -*This post is part of a series on deep learning. 
Check-out part 2 *[*here*](https://medium.com/@r.ruizendaal/deep-learning-2-f81ebe632d5c)* and part 3 *[*here*](https://medium.com/@r.ruizendaal/deep-learning-3-more-on-cnns-handling-overfitting-2bd5d99abe5d)*.* +**这篇文章是深度学习系列的第一部分。你可以在[这里](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-2-convolutional-neural-networks.md)查看第二部分,以及[这里](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-3-more-on-cnns-handling-overfitting.md)查看第三部分。** ![](https://cdn-images-1.medium.com/max/1600/1*y3guCmNkYLF2uR09Fslh5g.png) -This week: classifying images of cats and dogs -Welcome to this first entry in this series on practical deep learning. In this entry I will setup the Amazon Web Services (AWS) instance and use a pre-trained model to classify images of cats and dogs. +本周的任务:对猫和狗的图像进行分类 -In this complete series I will be blogging about my process in the first part of the Fast AI deep learning course. This course was first given at the Data Institute at the University of San Francisco and is now available as a MOOC. Recently the authors gave part 2 of the course which will become available online in a couple of months. The main reason for following this course is my extreme interest in deep learning. I have found many online resources regarding machine learning but practical courses on deep learning seem to be a rarity. Deep learning seems to be an exclusive group that is just a little harder to get into. The first thing needed to start on deep learning is a GPU. In this course we use the p2 instance from AWS. Let’s get that set up. +欢迎阅读本系列第一篇关于实战深度学习的文章。在本文中,我将创建 Amazon Web Services(AWS)实例,并使用预先训练的模型对猫和狗的图像进行分类。 -The first week of this course really focused on the setup. Getting your deep learning setup right can take a while and it is important to get everything working correctly. This includes setting up AWS, creating and configuring the GPU instance, setting up the process of ssh-ing into the server and managing your directories. 
+在这个完整的系列里,我会记录下我在 Fast AI 深度学习课程的第一部分内容的进度。这门课程最初是由旧金山大学数据研究所提供的,并且现在能够在 MOOC 上观看。最近,这门课的作者提供了第二部分的内容,并且在接下来的几个月都可以在网上观看。我上这门课的主要是因为我对深度学习有着强烈的兴趣。我在网上发现了许多关于机器学习的课程,但有关深度学习的实战课程还是比较少见的。深度学习似乎因为进入门槛略高一点,而被单独列出。开始深度学习之前我们首先需要一个 GPU,在这门课程里我们会使用 AWS 的 p2 实例。现在让我们一起来准备它。 -I ran into some issues with permissions on my internship laptop. Let me give you one tip that will save a lot of time in trying to bypass this: Make sure you have full administrator access on your laptop before attempting this. Some lovely engineers offered to setup the GPU instance for me, but they didn’t have time to do it soon. So I decided to take matters into my own hands. +这门课程第一周,我们会把重点放在准备工作上。正确地准备深度学习需要一点时间,但这对一切能正确运行很重要。这包括了设置 AWS,创建和配置 GPU 实例,设置 ssh 连接服务器以及管理你的目录。 -The scrips for setting up the AWS instance are written in bash. If you’re working on a Windows machine you will need a program that can handle this. I’m using Cygwin. I want to share some issues (and their solutions) that I ran into during the install. You can skip this if you’re not following the Fast AI course and just reading along. Some issues that I ran into during the setup process were: +我在实习期用的笔记本电脑上遇到了一些权限问题。我有个建议能够避免这个问题,从而帮你节省大量时间:在尝试操作之前,确保你在你的笔记本电脑上拥有完整的管理员权限。一些热情的工程师提出帮助我设置 GPU 实例,但是他们不能马上帮我搞定,所以我决定自己来。 -- The bash scripts throw an error +用来设置 AWS 的脚本是用 bash 写的,如果你用的是 Windows 操作系统,那么你需要一个能够处理它的程序,我用的是 Cygwin。我想分享一些在设置过程中我遇到的问题(以及对应的解决方案)。如果你没有在上 Fast AI 课程,你可以跳过这部分继续阅读。我在设置过程中所遇到的问题有: -I have read some possible explanations for this, but not a clear solution that worked for me. The setup script of the course on Github is now split in two scripts: setup_p2.sh and setup_instance.sh. In case you cannot get these two scripts to work you can use [this](https://github.com/ericschwarzkopf/courses/blob/dc06ce745a30850e7937858fb26a67df2aff329d/setup/setup_p2.sh) script to setup your p2 instance. If the script does not run be sure to try the raw version as well. 
+- bash 脚本报错 -I had a similar issue with the aws-alias.sh script. Adding a ‘ at the end of line 7 fixed this issue. Here is a before and after of line 7: + 我看过一些可能的原因,但是没有一个是对我有用的解决方案。Github 上这个课程的设置脚本有两个:setup_p2.sh 和 setup_instance.sh。如果上面那两个脚本不能用,你可以用[这个](https://github.com/ericschwarzkopf/courses/blob/dc06ce745a30850e7937858fb26a67df2aff329d/setup/setup_p2.sh)脚本试试。但如果这个脚本还是不行,请务必再尝试使用原始版本的脚本。 - alias aws-state='aws ec2 describe-instances --instance-ids $instanceId --query "Reservations[0].Instances[0].State.Name" + 我在 aws-alias.sh 这个脚本上也遇到了同样的问题,在第七行的末尾加上 `'` 能够解决这个问题。下面是修改前和修改后的第七行: - alias aws-state='aws ec2 describe-instances --instance-ids $instanceId --query "Reservations[0].Instances[0].State.Name"' + > alias aws-state='aws ec2 describe-instances --instance-ids $instanceId --query "Reservations[0].Instances[0].State.Name" + + > alias aws-state='aws ec2 describe-instances --instance-ids $instanceId --query "Reservations[0].Instances[0].State.Name"' -[Here](https://gist.github.com/LeCoupa/122b12050f5fb267e75f) is a Bash cheat sheet for everyone who is not familiar with Bash. I greatly recommend this since you will need Bash to interact with your instance. + [这里](https://gist.github.com/LeCoupa/122b12050f5fb267e75f)有一个为不熟悉 Bash 的人准备的 Bash 备忘录,因为你需要通过 Bash 来和你的实例进行交互,所以我非常推荐你去看看。 -- The Anaconda install. The video mentions that you should install Anaconda before installing Cygwin. This can be a bit confusing as you need to use the ‘Cygwin python’ to run the pip commands in there and not a local Anaconda distribution. +- Anaconda 的安装。视频中提到你需要在安装 Cygwin 之前先安装 Anaconda。你可能感到有些疑惑,因为你需要用“Cygwin python”来运行 pip 命令而不是一个本地的 Anaconda 分发版。 -Additionally, [this](https://github.com/TomLous/practical-deep-learning) repository has a nice step-by-step guide on getting your instance running. 
+另外,[这个](https://github.com/TomLous/practical-deep-learning)仓库有一个手把手的教程教你如何让你的实例运行起来。 --- -#### Getting started with Deep Learning +#### 开始深度学习 -After some issues I got my GPU instance running. Time to get started with deep learning! A quick disclaimer: in these blogs I won’t be repeating exactly what is listed in the lesson notes, there is no need for that. I will be highlighting some things that I found really interesting, as well as issues and ideas that I ran into while going through the lesson. +解决了一些问题之后我总算让我的 GPU 实例运行起来了。是时候开始深度学习了!一个简短的免责声明:在这一系列博客中,我不会重复已经在课程笔记中列出的内容,因为没必要。我会强调一些我觉得很有趣的事情,以及我在课程中遇到的问题和一些想法。 -Let’s start with the first question that is probably on your mind: **What is deep learning and why is it experiencing such hype right now?** +让我们从第一个可能已经在你脑海中的问题开始:**什么是深度学习?它现在为什么被炒得这么火?** -Deep learning simply is an artificial neural network with multiple hidden layers, this makes them ‘deep’. A general neural network only has one, maybe two hidden layers. A deep neural network has much more hidden layers. They also have different types of layers than the ‘simple’ ones in the normal neural network. +深度学习只是一个有着多个隐含层的人造神经网络,隐含层让它变得“深度”。一般的神经网络只有一层或者两层的隐含层,而一个深度神经网络有更多的隐含层。它们也具有与一般神经网络中的“简单”层不同类型的层。 ![](https://cdn-images-1.medium.com/max/1600/1*CcQPggEbLgej32mVF2lalg.png) -(Shallow) Neural Network -Currently deep learning is consistently beating performance on well-known datasets. Therefore deep learning has been experiencing a lot of hype. There are three reasons for the popularity of deep learning: +(浅) 神经网络 -- Infinitely flexible function -- All-purpose parameter fitting -- Fast and scalable +目前,深度学习在一些著名的数据集上不断地有着出色的表现,所以深度学习也经历了不少的炒作。深度学习的流行有三个原因: -The Neural Network is modeled after the human brain. According to the universal approximation theorem it can theoretically solve any function. The Neural Network is trained through backward propagation, which allows us to fit the parameters of the model to all these different functions. 
The last reason is the main one for the recent achievements in deep learning. Because of advancements in the gaming industry and the developments of powerful GPUs it is now possible to train deep neural networks in a fast and scalable way. +- 无限灵活的函数 +- 通用参数拟合 +- 迅速以及可拓展 -In this first lesson the goal is to use a pre-trained model, namely Vgg16 to classify images of cats and dogs. Vgg16 is a lightweight version of the model that won the Imagenet challenge in 2014. This is a yearly challenge and probably the biggest one in computer vision. We can take this pre-trained model and apply it to our dataset of cat and dog images. Our dataset has been edited by the authors of the course to make sure it is in the right format for our model. The original dataset can be found on [Kaggle](https://www.kaggle.com/c/dogs-vs-cats). When this competition was originally run in 2013, the state of the art was 80% accuracy. Our simple model will already achieve 97% accuracy. Mind-blowing right? This is how some of the pictures and their predicted labels look: +神经网络是通过模仿人脑而设计的。根据通用近似定理,它理论上能拟合任何函数。神经网络通过反向传播算法来训练,这使得我们能够调整模型的参数来适应不同的函数。最后一个原因,也是深度学习近期取得众多成就的主要原因。因为游戏行业的进步和 GPU 计算能力的强劲发展,现在我们以非常快速和可扩展的方式来训练深层的神经网络。 + +在第一节课里,我们的目标是使用一个叫做 Vgg16 的预先训练好的模型,来对猫和狗的图片进行分类。Vgg16 是 2014 年赢得 Imagenet 比赛模型的一个轻量级版本。这是一个年度的比赛并且可能是计算机视觉方面最大的一个比赛。我们可以利用这预先训练好的模型,并且把它应用到我们的猫和狗的图片数据集上。我们的数据集已经被课程的作者编辑过了,以确保它的格式正确。原始的数据集可以在 [Kaggle](https://www.kaggle.com/c/dogs-vs-cats) 上找到。这场比赛最初是在 2013 年进行的,那时的准确率是 80%。而我们的简单模型已经能够达到 97%的准确度。大脑现在还清醒吧?下面是一些照片和他们被预测的标记: ![](https://cdn-images-1.medium.com/max/1600/1*y3guCmNkYLF2uR09Fslh5g.png) -Predicted labels for dogs and cats -The target labels are setup using a process called one-hot-encoding which is often used for categorical variables. [1. 0.] refers to a cat and [0. 1.] refers to a dog. Instead of having one variable named ‘target’ with two levels 0 and 1, we create an array with two values. You can look at these variables as ‘cats’ and ‘dogs’. 
If the variable is true it gets labeled as a 1 and otherwise as a 0. In a multi-classification problem this can mean that your output vector looks like this: [0 0 0 0 0 0 0 1 0 0 0]. In this case the Vgg16 model outputs the probability of the image belonging to class ‘cat’ and the probability of the image belonging to the class ‘dog’. The next challenge is to tweak this model so we can apply it to another dataset. +狗狗们和猫猫们被预测的标记 + +我们用叫做独热编码的方法来处理目标标记,这是分类问题中常用的方法。[1. 0.] 说明图片中是一只猫, [0. 1.] 则说明是一只狗。我们没有用一个叫做“目标”的有 0 和 1 两种取值的变量,而是创建了一个包含两个值的数组。你可以把这些变量看成“猫猫”和“狗狗”。如果变量为正,那么它就会被标记为 1,否则就是 0。在一个多分类问题中,这意味着你的输出向量可能长成这样:[0 0 0 0 0 0 0 1 0 0 0]。在这个例子中,Vgg16 模型会输出图片属于“猫”这个类别的可能性以及属于“狗”这个类别的可能性。接下来的一个挑战是调整这个模型,以便我们将其应用于另一个数据集。 --- -#### **Dogs vs. Cats Redux** +#### **狗狗还是猫猫 终极版** -Essentially this is the same dataset as the previous one, but it is not pre-processed by the authors of the course. The Kaggle Command Line Interface (CLI) provides a quick way to download the dataset. It can be installed via pip. A dollar sign is often used to show that a command is run in the terminal. +本质上这个数据集和先前的是同一个数据集,但是没有被课程作者预处理过。Kaggle 命令行接口(CLI)提供了一个快捷的方法来下载这个数据集,可以通过 pip 来安装。一个美元标志通常用来表示命令运行在终端中。 $ pip install kaggle-cli -The training set contains 25.000 labeled images of dogs and cats, while the test set contains 12.500 unlabeled images. In order to finetune the parameters we also create a validation set by taking a small part of the training set. It is also useful to set-up a ‘sample’ of the full dataset that you can use to quickly check if your model is working during the building proces. +训练数据集中有 25000 张已经被标记为猫或是的狗的图片,测试数据集中则包含 12500 张未被标记的图片。为了调整参数,我们还通过占用训练集的一小部分来创建验证数据集。设置一个完整数据集的“样本”也很有用,可以用来快速检查你的模型在构建过程中是否正常工作。 -In order to run our model we use the Keras library. This library sits on top of the popular deep learning libraries Theano and TensorFlow. Keras basically makes it more intuitive to code your network. 
This means that you can focus more on the structure of the network and worry less about the TensorFlow API. In order to know which picture belongs to which class Keras looks at the directory it is stored in. Therefore, it is important to make sure you move the images to the correct directories. The bash commands that are needed to do this can be run directly from the Jupyter Notebook where we do all our coding. [This](https://www.cyberciti.biz/faq/mv-command-howto-move-folder-in-linux-terminal/) link contains additional information on these commands. +我们使用 Keras 库来运行我们的模型,这个库是基于 Thenao 和 TensorFlow 的最流行的深度学习库之一。Keras 能够让你更加直观地来编写神经网络,这意味着你能够更多地关注神经网络的架构而不用担心 TensorFlow API。因为Keras 通过查看图片所属的目录来确定它的类别,所以把图片移动到正确的目录非常的重要。这些操作所需的 bash 命令可以直接在 Jupyter Notebook 中运行,也就是我们写代码的地方。[这个](https://www.cyberciti.biz/faq/mv-command-howto-move-folder-in-linux-terminal/)链接包含了额外的一些关于这些命令的信息。 -One epoch, which is a full pass through the dataset, takes 10 minutes on my Amazon p2 instance. In this case that dataset is the training set which consists of 23.000 images. The other 2000 images are in the validation set. I decided to use 3 epochs here. The accuracy on the validation set is around 98%. After training the model we can take a look at some of the correctly classified images. In this case we use the probabilities of the image being a cat. 1.0 refers to full confidence that the image is of a cat and 0.0 that the image is of a dog. +一个 epoch,也就是在数据集完整地跑一遍,在我的 Amazon p2 实例上花费了 10 分钟时间。在这个例子里数据集是包含 23000 张图片的训练数据集,另外的 2000 张图片被保留下来作为验证数据集。在这里我决定使用 3 个 epoch。在验证数据集上的准确度在 98% 左右。训练好模型之后,我们可以看一些被正确分类的图片。在这个例子里,我们用图片中是一只猫的概率作为结果。1.0 表示模型非常自信地认为图片中是一只猫,而 0.0 则表示图片中是一只狗。 ![](https://cdn-images-1.medium.com/max/1600/1*fgOX3G_imeRsodKuBBA8Tg.png) -Correctly classified images -Now let’s take a look at some of the wrongly classified images. As we can see most of them are taken from far away and feature multiple animals. 
The original Vgg model was used for images where one thing of the target class was clearly visible in the picture. Am I the only one who finds the fourth picture slightly terrifying? +被正确分类的图片 + +现在让我们来看一些被错误分类的图片。正如我们所见,这些图片大部分是从远处拍摄的,并且图片里有多种动物。原始的 Vgg 模型是用在图片中只有一种清晰可见目标类别中的。只有我觉得第四张图片有点可怕吗? ![](https://cdn-images-1.medium.com/max/1600/1*jD6t1ifVrrGq571eh5lqhA.png) -Incorrectly classified images -Finally, these are the images that the model was most uncertain about. This means that the probability was closest to 0.5 (where 1 is a cat and 0 a dog). The fourth picture features a cat where only the face is visible. The first and third picture are rectangular and not square like the the pictures the original model was trained on. +被错误分类的图片 + +最后,这些是模型对其类别最不确定的一些图片。这意味着概率非常接近 0.5(1 代表是一只猫而 0 代表是一只狗)。第四张图片中的猫只有一张脸露出来。第一张和第三张图片是长方形的而不是原模型训练集中的正方形。 ![](https://cdn-images-1.medium.com/max/1600/1*zlSUpvspBf9zYm175uaY1w.png) -Images where the model is most uncertain -That’s it for this week. Personally I can’t wait to get started on lesson 2 and learn more about the internals of the model. Hopefully we will also start on building a model from scratch with Keras! +模型最不确定的图片 -Also, thanks to everyone who is updating the Github scripts. It helped a lot! Another thank you to everyone on the Fast AI forums, you’re awesome. +这就是这周的内容。就我个人而言,我已经迫不及待地想要开始第二周的课程并且学习更多关于这个模型的内部细节。希望我们也能开始利用 Keras 从头构建一个模型。 -If you liked this posts be sure to recommend it so others can see it. You can also follow this profile to keep up with my process in the Fast AI course. See you there! +同时,感谢所有更新 GitHub 脚本的人,这可帮了大忙!另外也要感谢所有参与 Fast AI 论坛的人,你们太棒了。 +如果你喜欢这篇文章,请把它推荐给你的朋友们,让更多人的看到它。你也可以按照这篇文章,跟上我在 Fast AI 课程中的进度。到时候那里见! 
--- > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)。 + From de6161de97f62b976403f7258bc69fe1b4210ced Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Mon, 7 Aug 2017 22:14:44 +0800 Subject: [PATCH 02/15] :sparkles: Create understanding-service-workers.md --- TODO/understanding-service-workers.md | 410 ++++++++++++++++++++++++++ 1 file changed, 410 insertions(+) create mode 100644 TODO/understanding-service-workers.md diff --git a/TODO/understanding-service-workers.md b/TODO/understanding-service-workers.md new file mode 100644 index 00000000000..bb16447edb0 --- /dev/null +++ b/TODO/understanding-service-workers.md @@ -0,0 +1,410 @@ + + > * 原文地址:[Understanding Service Workers](http://blog.88mph.io/2017/07/28/understanding-service-workers/) + > * 原文作者:[Adnan Chowdhury](http://blog.88mph.io/author/adnan/) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/understanding-service-workers.md](https://github.com/xitu/gold-miner/blob/master/TODO/understanding-service-workers.md) + > * 译者: + > * 校对者: + + # Understanding Service Workers + + What are Service Workers? What can they do, and how can make your web app perform better? This article sets out to answer those questions, plus how to implement them using the Ember.js framework. 
+ +## Table of Contents + +- [Background](#background) +- [Registration](#registration) +- [Install Event](#installevent) +- [Fetch Event](#fetchevent) +- [Caching Strategies](#cachingstrategies) +- [Activate Event](#activateevent) +- [Sync Event](#syncevent) +- [When is the Sync Event fired?](#whenisthesynceventfired) +- [Push Notifications](#pushnotifications) +- [Notifications](#notifications) +- [Push messaging](#pushmessaging) +- [Implementing Using Ember.js](#implementingusingemberjs) +- [Understanding ember-service-worker Conventions](#understandingemberserviceworkerconventions) +- [Build your Ember App w/ Service Workers](#buildyouremberappwserviceworkers) +- [Conclusion](#conclusion) + +## Background + +In a time when the web was young, there was scarcely any thought given to how a web page should behave when a user was offline. You were just *always* online. + +![Connected!](http://blog.88mph.io/content/images/2017/07/aol-connected.jpg) + +Connected! The gang's all here! Don't ever leave. + +But with the advent of mobile internet, and with the rest of the world catching up, spotty internet connections have become increasingly commonplace across users of the modern web. + +Consequently, it has become valuable for websites to take ownership of how they behave offline so that users are not limited by network availability. + +[AppCache](https://developer.mozilla.org/en-US/docs/Web/HTML/Using_the_application_cache) was initially introduced as part of the HTML5 spec as a solution for offline web applications. It consisted of a combination of HTML and JS that centered around a *cache manifest*, a configuration file written in a declarative language. + +AppCache was eventually found to be [unwieldy and full of gotchas](https://alistapart.com/article/application-cache-is-a-douchebag). It has since been deprecated and effectively replaced by Service Workers. 
+ +[Service workers](https://developer.mozilla.org/en-US/docs/Web/API/Service_Worker_API) provide a more future-proof solution to the offline problem, by replacing AppCache's declarative style of implementation with a more imperative, procedural one. + +Service Workers are a way to execute code in a persistent, background process contained in the web browser. The code is event-driven, meaning the events that fire in the scope of a Service Worker are what drives its behavior. + +The rest of this article is a brief explanation for each of those events. But to begin utilizing Service Workers, you will first need to implement code in your front-facing web app that registers the Service Worker. + +## Registration + +The code below illustrates how to **register** your Service Worker in the client's browser. This is accomplished by having the following `register` call executed somewhere on your front-facing web app: + +``` +if (navigator.serviceWorker) { + navigator.serviceWorker.register('/sw.js') + .then(registration => { + console.log('congrats. scope is: ', registration.scope); + }) + .catch(error => { + console.log('sorry', error); + }); +} +``` + +This will tell the browser where to find your Service Worker implementation. The browser will look for the file (`/sw.js`) and save it as a Service Worker under the domain that is being accessed. This file will contain all of the event handlers that will define your Service Worker. + +![](http://blog.88mph.io/content/images/2017/07/Screenshot-2017-07-16-17.39.10.png) + +A registered Service Worker in Chrome DevTools + +It will also set the **scope** of your ServiceWorker. The filename `/sw.js` implies that the scope of the SW is the root path of your URL (or `http://localhost:3000/`). This means any requests that are made under the root path of your URL will be made visible to the SW via fired events. A filename such as `/js/sw.js` would capture requests only under `http://localhost:3000/js`. 
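To make the scope rule concrete, here is a small sketch built on the standard `URL` API. `defaultScope` and `isInScope` are hypothetical helper names, not part of the Service Worker API. They only mirror the default behaviour, where the scope is the directory containing the worker script and a URL is covered when it starts with that scope.

```javascript
// Hypothetical helpers for illustration; not part of the Service Worker API.

// By default, the scope is the directory that contains the worker script.
function defaultScope(scriptUrl) {
  return new URL('./', scriptUrl).href;
}

// A URL is covered by a scope when it starts with the scope URL.
function isInScope(scope, url) {
  return new URL(url).href.startsWith(scope);
}

const rootScope = defaultScope('http://localhost:3000/sw.js');
const jsScope = defaultScope('http://localhost:3000/js/sw.js');

console.log(rootScope); // http://localhost:3000/
console.log(jsScope);   // http://localhost:3000/js/
console.log(isInScope(jsScope, 'http://localhost:3000/js/app.html')); // true
console.log(isInScope(jsScope, 'http://localhost:3000/index.html'));  // false
```

Note the trailing slash: the default scope is a directory, so `/js/sw.js` covers URLs under `/js/`.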
+ +Alternatively, you could explicitly set the scope of your SW by passing a second argument to the `register` method: +`navigator.serviceWorker.register('/sw.js', { scope: '/js' })`. + +## Event Handlers + +Now that your Service Worker is registered, it's time to implement the event handlers that are triggered during the lifetime of your Service Worker. + +#### Install Event + +The install event is fired when your Service Worker registers for the first time, and any time after that when your Service Worker file (`/sw.js`) is updated (the browser will automatically detect changes). + +The install event is useful for logic you want to execute during the initialization of your Service Worker, i.e. a one-off operation that sets things up for the life of your Service Worker. A common use case is to load the cache during the install step. + +Here is an example of an install event handler that will add data to the cache. + +``` +const CACHE_NAME = 'cache-v1'; +const urlsToCache = [ + '/', + '/js/main.js', + '/css/style.css', + '/img/bob-ross.jpg', +]; + +self.addEventListener('install', event => { + caches.open(CACHE_NAME) + .then(cache => { + return cache.addAll(urlsToCache); + }); +}); +``` + +`urlsToCache` contains a list of URLs we want to add to the cache. + +`caches` is a global [CacheStorage](https://developer.mozilla.org/en-US/docs/Web/API/CacheStorage) object that allows you to manage your caches in the browser. We call `open` to retrieve the specific [Cache](https://developer.mozilla.org/en-US/docs/Web/API/Cache) object we want to work with. + +`cache.addAll` will take a list of URLs, make a request to each, and then store the response in its cache. It uses the request body as a key for each cache value. Read more at the [addAll](https://developer.mozilla.org/en-US/docs/Web/API/Cache/addAll) docs. 
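One detail worth sketching: the install event has a standard `waitUntil` method, and handing it the caching promise means installation only counts as complete once the files are stored. The version below is a minimal sketch under that assumption. `precache` is a hypothetical helper name, and the guard around the listener exists only so the snippet can be loaded in environments without the Cache API.

```javascript
const CACHE_NAME = 'cache-v1';
const urlsToCache = ['/', '/js/main.js', '/css/style.css', '/img/bob-ross.jpg'];

// Hypothetical helper: opens the named cache and stores a response for each URL.
// `cacheStorage` is expected to behave like the global `caches` object.
function precache(cacheStorage, cacheName, urls) {
  return cacheStorage.open(cacheName).then(cache => cache.addAll(urls));
}

// Only attach the listener where the Cache API is available
// (it is not in Node, for instance).
if (typeof caches !== 'undefined' && typeof self !== 'undefined') {
  self.addEventListener('install', event => {
    // Keep the worker in its installing phase until caching has finished;
    // if the promise rejects, the installation fails.
    event.waitUntil(precache(caches, CACHE_NAME, urlsToCache));
  });
}
```

Passing the storage object in as a parameter is a design choice for the example only; inside a real `sw.js` you could call `caches.open` directly.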
+ +![](http://blog.88mph.io/content/images/2017/07/Screenshot-2017-07-16-20.09.42.png) + +Cached data in Chrome DevTools + +#### Fetch event + +The **fetch** event is fired every time the web page makes a request. When it fires, your Service Worker has the ability to 'intercept' the request and decide what to return - whether that be cached data, or the response to an actual network request. + +The following example illustrates a *cache-first* strategy: any cached data that matches the request will be sent off first, without a network request. Only if there is no existing cached data will a network request be made. + +``` +self.addEventListener('fetch', event => { + const { request } = event; + const findResponsePromise = caches.open(CACHE_NAME) + .then(cache => cache.match(request)) + .then(response => { + if (response) { + return response; + } + + return fetch(request); + }); + + event.respondWith(findResponsePromise); +}); +``` + +`request` contains the request body that is included in the [FetchEvent](https://developer.mozilla.org/en-US/docs/Web/API/FetchEvent) object. It is used to lookup a matching response in the cache. + +`cache.match` will try to find a cached response that matches the specified request. If it finds nothing, the promise will resolve with `undefined`. We check for this, and make a `fetch` call in this case, which makes a network request and returns a promise. + +`event.respondWith` is a method specifically on a FetchEvent object that we use to send a response back to the browser for the request. It accepts a Promise that resolves to a response (or network error). + +###### Caching Strategies + +The fetch event is particularly important because it's where you can define your *caching strategy*. That is, how you determine when to use cached data, and when to use network-sourced data. + +The beauty in Service Workers is that it is a low-level API for intercepting requests and lets you decide what response to provide for them. 
This allows us the freedom to implement our own strategy for providing cached or network-sourced content. There are several basic caching strategies that you could employ when trying to implement the best one for your web app. + +Mozilla has a [handy resource](https://serviceworke.rs/caching-strategies.html) that documents several different caching strategies. There is also [The Offline Cookbook](https://developers.google.com/web/fundamentals/instant-and-offline/offline-cookbook) written by Jake Archibald that outlines some of the same caching strategies, and more. + +In an above example, we demonstrated a basic **cache-first** strategy. The following is an example which I've found applicable in my own projects: a **cache and update** strategy. This method will let the cache respond first, but subsequently make a network request in the background. The response from this background request is used to update the value in the cache so that an updated response is provided the next time it is accessed. + +``` +self.addEventListener('fetch', event => { + const { request } = event; + + event.respondWith(caches.open(CACHE_NAME) + .then(cache => cache.match(request)) + .then(matching => matching || fetch(request))); + + event.waitUntil(caches.open(CACHE_NAME) + .then(cache => fetch(request) + .then(response => cache.put(request, response)))); +}); +``` + +`event.respondWith` is used to provide a response to the request. Here we are opening the cache and finding a matching response. If it doesn't exist, we reach out to the network. + +Subsequently, we call `event.waitUntil` to allow the async Promise to resolve before the Service Worker context is terminated. Here we make a network request, and then cache the response. Once this asynchronous operation is finished, `waitUntil` will resolve and the operation will terminate. 
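The same cache-and-update idea can be written as a plain function with the cache and the network passed in, which makes it easy to try outside a browser. This is a sketch rather than the article's implementation: `cacheAndUpdate`, `fakeCache`, and `fakeFetch` are names invented for the example, and the stand-ins only mimic the `match` and `put` methods of a real `Cache`.

```javascript
// Hypothetical helper: answer from the cache if possible, and refresh the
// cached entry from the network in the background.
function cacheAndUpdate(cache, fetchFn, request) {
  const response = cache.match(request)
    .then(matching => matching || fetchFn(request));
  const update = fetchFn(request)
    .then(fresh => cache.put(request, fresh));
  return { response, update };
}

// In-memory stand-ins so the sketch runs without a browser.
const store = new Map([['/greeting', 'cached hello']]);
const fakeCache = {
  match: request => Promise.resolve(store.get(request)),
  put: (request, value) => Promise.resolve(store.set(request, value)),
};
const fakeFetch = request => Promise.resolve('fresh hello');

const { response, update } = cacheAndUpdate(fakeCache, fakeFetch, '/greeting');
response.then(value => console.log('served:', value));
update.then(() => console.log('cache now holds:', store.get('/greeting')));
```

Here the caller is served `cached hello`, while the stored entry becomes `fresh hello` for the next visit. In a fetch handler, the two promises would go to `event.respondWith` and `event.waitUntil` respectively.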
+ +#### Activate Event + +The activate event is a slightly less documented event, but is important for when you are updating your Service Worker file and need to execute any clean up or maintenance from the previous version of your Service worker. + +When you update your Service Worker file (`/sw.js`), the browser will detect changes and display this in Chrome DevTools: + +![](http://blog.88mph.io/content/images/2017/07/Screenshot-2017-07-18-08.29.32.png) + +Your new Service Worker is 'waiting to activate'. + +When the actual web page is closed, and re-opened again, the browser will replace the old Service Worker with the new one, and fire the **activate** event, after the **install** event. If you needed to clean up the caches or perform maintenance regarding the old version of your Service Worker, the activate event allows you the perfect time to do this. + +#### Sync event + +The sync event allows the deferring of network tasks until the user has connectivity. The feature it implements is commonly referred to as **background sync**. This is useful for ensuring that any network-dependent tasks that a user kicks off during offline mode will eventually reach their intended destination when the network is available again. + +Here is an example of what a background sync implementation would look like. You'll need code in your front-facing JS that registers a sync event, accompanied by a sync event handler in your Service Worker: + +``` +// app.js +navigator.serviceWorker.ready + .then(registration => { + document.getElementById('submit').addEventListener('click', () => { + registration.sync.register('submit').then(() => { + console.log('sync registered!'); + }); + }); + }); +``` + +Here we are assigning a click event to a button that will call `sync.register` on the [ServiceWorkerRegistration](https://developer.mozilla.org/en-US/docs/Web/API/ServiceWorkerRegistration) object. 
+ +Basically, any operation that you want to ensure reaches the network either immediately or eventually when the network comes online, needs to be registered as a sync event. + +This could be something like POSTing a comment, or fetching user data, which will be defined in the Service Worker's event handler: + +``` +// sw.js +self.addEventListener('sync', event => { + if (event.tag === 'submit') { + console.log('sync!'); + } +}); +``` + +Here we are listening for a sync event, and checking for the `tag` on the [SyncEvent](https://developer.mozilla.org/en-US/docs/Web/API/SyncEvent) object to see if it matches the `'submit'` tag we specified for the click event. + +If multiple sync's under the `'submit'` tag are registered, the sync event handler will only execute once. + +So for this example, if the user were offline, and clicked the button seven times, when the network returned, all sync registrations would consolidate and the sync event would fire just once. + +In the case you would want separate syncs for each click event, you would register syncs under unique tags. + +###### When is the Sync Event fired? + +If the user is online, then the sync event will fire immediately and accomplish whatever task you've defined without delay. + +If the user is offline, the sync event will fire as soon as network connectivity is regained. + +If you're like me, and want to try this out in Chrome, be sure to actually disconnect your internet by disabling your Wi-Fi or otherwise network adapter. Toggling the Network checkbox in Chrome DevTools will not trigger sync events. + +For more information, you can read [this explainer document](https://github.com/WICG/BackgroundSync/blob/master/explainer.md), as well as this [introduction to background syncs](https://developers.google.com/web/updates/2015/12/background-sync). The sync event is largely unimplemented across browsers (only in Chrome at the time of this writing), and is bound to undergo changes, so stay tuned. 
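Given the limited browser support mentioned above, it may be worth checking for background sync before relying on it. The following is a minimal sketch. `canUseBackgroundSync` is a hypothetical helper, and it takes the global object as a parameter (you would pass `window` in a page) so that it can be exercised anywhere.

```javascript
// Hypothetical helper: reports whether the given global object exposes both
// Service Workers and the SyncManager interface used for background sync.
function canUseBackgroundSync(globalObject) {
  return Boolean(
    globalObject &&
    globalObject.navigator &&
    'serviceWorker' in globalObject.navigator &&
    'SyncManager' in globalObject
  );
}

// Stand-in objects for a supporting and a non-supporting browser.
const supportingBrowser = {
  navigator: { serviceWorker: {} },
  SyncManager: function () {},
};
const olderBrowser = { navigator: {} };

console.log(canUseBackgroundSync(supportingBrowser)); // true
console.log(canUseBackgroundSync(olderBrowser));      // false
```

When the check fails, one reasonable fallback is to perform the network task directly in the click handler instead of registering a sync.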
+ +#### Push Notifications + +Push notifications are a feature that are enabled by Service Workers by exposing the `push` event to Service Workers, as well as the [Push API](https://developer.mozilla.org/en-US/docs/Web/API/Push_API) implemented by the browser. + +When speaking about Web Push Notifications, there are actually two technologies at work: Notifications & Push Messaging. + +###### Notifications + +Notifications are pretty straightforward feature to implement with Service Workers: + +``` +// app.js +// ask for permission +Notification.requestPermission(permission => { + console.log('permission:', permission); +}); + +// display notification +function displayNotification() { + if (Notification.permission == 'granted') { + navigator.serviceWorker.getRegistration() + .then(registration => { + registration.showNotification('this is a notification!'); + }); + } +} +``` + +``` +// sw.js +self.addEventListener('notificationclick', event => { + // notification click event +}); + +self.addEventListener('notificationclose', event => { + // notification closed event +}); +``` + +You first need to ask permission from the user to enable notifications for your web page. From then on, you are able to toggle on notifications, and handle certain events, such as when a notification is closed by the user. + +###### Push Messaging + +Push messaging involves utilizing the Push API provided by the browser, coupled with backend implementation. An entirely separate article could be written on the implementation of Push API, but the basic gist is: + +![Push API Diagram](http://blog.88mph.io/content/images/2017/07/push-api.svg) + +It is an involved and slightly complicated process, and is outside the scope of this article. But if you'd like to learn more, this [introduction to push notifications](https://developers.google.com/web/ilt/pwa/introduction-to-push-notifications) is an informative read. 
+
+## Implementing Using Ember.js
+
+Implementing Service Workers for your Ember app is incredibly easy. By virtue of [ember-cli](https://ember-cli.com/) and the [Ember Add-ons](https://www.emberaddons.com) community, you can equip your web app with Service Workers in plug-and-play fashion.
+
+This is made possible in part by the [ember-service-worker](https://github.com/DockYard/ember-service-worker) add-on, provided by the folks at DockYard (docs [here](http://ember-service-worker.com/documentation/getting-started/)).
+
+**ember-service-worker** sets up a modular architecture that can be used to plug in other ember-service-worker-* add-ons, such as [ember-service-worker-index](https://github.com/DockYard/ember-service-worker-index) or [ember-service-worker-asset-cache](https://github.com/DockYard/ember-service-worker-asset-cache). These add-ons implement different parts of behavior and caching strategies to make up your Service Worker.
+
+#### Understanding `ember-service-worker` conventions
+
+All of the **ember-service-worker-*** add-ons follow a convention, in that their core logic is stored in one of two folders in the root directory of the add-on, `/service-worker` and `/service-worker-registration`:
+
+    node_modules/ember-service-worker
+    ├── ...
+    ├── package.json
+    ├── service-worker
+    │   └── index.js
+    └── service-worker-registration
+        └── index.js
+
+
+`/service-worker` is where the main implementation of your Service Worker is located (what you would store in `sw.js` as shown earlier).
+
+`/service-worker-registration` holds the logic you need to run in your front-facing code, where Service Worker registration would take place.
+ +Let's take a look at the `/service-worker` implementation for **ember-service-worker-index** (code [here](https://github.com/DockYard/ember-service-worker-index/blob/master/service-worker/index.js)) to divulge what it actually does: + +``` +import { + INDEX_HTML_PATH, + VERSION, + INDEX_EXCLUDE_SCOPE +} from 'ember-service-worker-index/service-worker/config'; + +import { urlMatchesAnyPattern } from 'ember-service-worker/service-worker/url-utils'; +import cleanupCaches from 'ember-service-worker/service-worker/cleanup-caches'; + +const CACHE_KEY_PREFIX = 'esw-index'; +const CACHE_NAME = `${CACHE_KEY_PREFIX}-${VERSION}`; + +const INDEX_HTML_URL = new URL(INDEX_HTML_PATH, self.location).toString(); + +self.addEventListener('install', (event) => { + event.waitUntil( + fetch(INDEX_HTML_URL, { credentials: 'include' }).then((response) => { + return caches + .open(CACHE_NAME) + .then((cache) => cache.put(INDEX_HTML_URL, response)); + }) + ); +}); + +self.addEventListener('activate', (event) => { + event.waitUntil(cleanupCaches(CACHE_KEY_PREFIX, CACHE_NAME)); +}); + +self.addEventListener('fetch', (event) => { + let request = event.request; + let isGETRequest = request.method === 'GET'; + let isHTMLRequest = request.headers.get('accept').indexOf('text/html') !== -1; + let isLocal = new URL(request.url).origin === location.origin; + let scopeExcluded = urlMatchesAnyPattern(request.url, INDEX_EXCLUDE_SCOPE); + + if (isGETRequest && isHTMLRequest && isLocal && !scopeExcluded) { + event.respondWith( + caches.match(INDEX_HTML_URL, { cacheName: CACHE_NAME }) + ); + } +}); +``` + +Without getting bogged down in the details, we can see that this code is basically implementing three of the event handlers we've talked about: `install`, `activate` and `fetch`. + +In the `install` event handler, we are fetching `INDEX_HTML_URL`, and then calling `cache.put` to store the response. + +`activate` does some rudimentary clean up. 
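As a rough idea of what such a cleanup step could look like, here is a sketch that deletes caches sharing the prefix but not the current name. It is not the add-on's actual `cleanup-caches` implementation, which is not shown in this article; `staleCacheNames` and `cleanupStaleCaches` are hypothetical names, and only `caches.keys()` and `caches.delete()` are browser APIs.

```javascript
// Hypothetical helper: pick the caches that belong to this add-on
// (same prefix) but are not the current version.
function staleCacheNames(allNames, prefix, currentName) {
  return allNames.filter(
    name => name.startsWith(prefix) && name !== currentName
  );
}

// Browser / Service Worker only: delete every stale cache.
// Relies on the Cache Storage API (caches.keys, caches.delete).
async function cleanupStaleCaches(prefix, currentName) {
  const names = await caches.keys();
  const stale = staleCacheNames(names, prefix, currentName);
  await Promise.all(stale.map(name => caches.delete(name)));
}
```

In an `activate` handler this would be wrapped in `event.waitUntil(...)`, in the same way the code above wraps `cleanupCaches`.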
+
+In the `fetch` handler, we check whether `request` meets several conditions (is it a `GET` request; is it asking for HTML; is it local; etc.), and if it satisfies them, we respond with what is stored in the cache.
+
+Notice we're calling `caches.match` and using `INDEX_HTML_URL` to look up the value, not `request.url`. This means we always look up the same cache key, no matter what the actual URL is.
+
+This is because an Ember app always renders using `index.html`. Any request under the root URL of the app ends up with the cached version of `index.html`, and the Ember app takes over from there. That is the purpose of **ember-service-worker-index**: to cache `index.html`.
+
+Similarly, [**ember-service-worker-asset-cache**](https://github.com/DockYard/ember-service-worker-asset-cache) will cache all the assets found in the `/assets` folder by implementing its own `install` and `fetch` event handlers.
+
+There are [several add-ons](https://www.emberaddons.com/?query=service-worker) that employ the **ember-service-worker** architecture and allow you to customize and fine-tune your Service Worker's behavior and caching strategies.
+
+#### Build your Ember App w/ Service Workers
+
+First, you'll need [ember-cli](https://ember-cli.com/) installed. Then execute the following commands:
+
+```
+$ ember new new-app
+$ cd new-app
+$ ember install ember-service-worker
+$ ember install ember-service-worker-index
+$ ember install ember-service-worker-asset-cache
+```
+
+
+Your app is now serviced by Service Workers and by default will have `index.html` and `/assets/**/*` cached.
+
+You can fine-tune which files under the `/assets` folder get cached via `config/environment.js`.
+
+If you find that none of the existing ember-service-worker add-ons solve your problem, you can create your own following the [docs at the ember-service-worker website](http://ember-service-worker.com/documentation/authoring-plugins/).
+ +## Conclusion + +I hope you have gained a firmer understanding of Service Workers, and their underlying architecture, and also how web apps can utilize them to create a better experience for users. + +`ember-service-worker` add-ons allow you implement them easily in your Ember.js web app. If you find that you need to implement your own logic for a Service Worker, it should be easy to create your own add-on that implements the event handlers you need to implement the behavior you want. This is something I'd like to tackle in the near future, so stay tuned! + +#### From our Sponsors + +![](http://blog.88mph.io/content/images/2017/07/Quartzy-logo.png) + +*If you are interested in working with Ember.js full-time, [Quartzy](https://www.quartzy.com/) is hiring frontend devs! We help scientists around the world by helping them save money and be more efficient in the lab. Apply [here](http://grnh.se/coe8yp1).* + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From 8f33319d34a04cb1751baf6457b9aee62e170501 Mon Sep 17 00:00:00 2001 From: sqrtthree Date: Tue, 8 Aug 2017 17:06:45 +0800 Subject: [PATCH 03/15] =?UTF-8?q?:rocket:=20=E6=B7=BB=E5=8A=A0=E6=96=87?= =?UTF-8?q?=E7=AB=A0=E3=80=8E=E6=B8=90=E8=BF=9B=E5=A2=9E=E5=BC=BA=E7=9A=84?= =?UTF-8?q?=20CSS=20=E5=B8=83=E5=B1=80=EF=BC=9A=E4=BB=8E=E6=B5=AE=E5=8A=A8?= =?UTF-8?q?=E5=88=B0=20Flexbox=20=E5=88=B0=20Grid=E3=80=8F=E5=88=B0?= 
=?UTF-8?q?=E6=96=87=E7=AB=A0=E5=88=97=E8=A1=A8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 ++-- front-end.md | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index c583b067b2f..59a619d5906 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [掘金翻译计划](https://juejin.im/tag/%E6%8E%98%E9%87%91%E7%BF%BB%E8%AF%91%E8%AE%A1%E5%88%92) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](#android)、[iOS](#ios)、[React](#react)、[前端](#前端)、[后端](#后端)、[产品](#产品)、[设计](#设计) 等领域,读者为热爱新技术的新锐开发者。 -掘金翻译计划目前翻译完成 [600](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 +掘金翻译计划目前翻译完成 [601](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 # 官方指南 @@ -49,10 +49,10 @@ ## 前端 +* [渐进增强的 CSS 布局:从浮动到 Flexbox 到 Grid](https://juejin.im/post/5987acfd6fb9a03c502288f3?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([leviding](https://github.com/leviding) 翻译) * [Web 端的下一代三维图形](https://juejin.im/post/5983208c5188253c6f2d185d?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([reid3290](https://github.com/reid3290) 翻译) * [在大型应用中使用 Redux 的五个技巧](https://juejin.im/post/5980514151882537b41c4c0d?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([loveky](https://github.com/loveky) 翻译) * [在 CSS 中使用特征查询](https://juejin.im/post/58eb3004ac502e006c45454b?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([sunshine940326](https://github.com/sunshine940326) 翻译) -* [构建渐进式 Web 应用入门指南](https://juejin.im/entry/5979666af265da3e161a6402/detail?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([AceLeeWinnie](https://github.com/AceLeeWinnie) 翻译) * [所有前端译文>>](https://github.com/xitu/gold-miner/blob/master/front-end.md) ## React diff --git a/front-end.md b/front-end.md index 29f04ea82fc..e4a07721edb 100644 --- 
a/front-end.md +++ b/front-end.md @@ -1,3 +1,4 @@ +* [渐进增强的 CSS 布局:从浮动到 Flexbox 到 Grid](https://juejin.im/post/5987acfd6fb9a03c502288f3?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([leviding](https://github.com/leviding) 翻译) * [Web 端的下一代三维图形](https://juejin.im/post/5983208c5188253c6f2d185d?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([reid3290](https://github.com/reid3290) 翻译) * [在大型应用中使用 Redux 的五个技巧](https://juejin.im/post/5980514151882537b41c4c0d?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([loveky](https://github.com/loveky) 翻译) * [在 CSS 中使用特征查询](https://juejin.im/post/58eb3004ac502e006c45454b?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([sunshine940326](https://github.com/sunshine940326) 翻译) From 1738b5e3c6a95788fd0fbeb8acc8ffb2312f4b9c Mon Sep 17 00:00:00 2001 From: sqrtthree Date: Tue, 8 Aug 2017 17:08:34 +0800 Subject: [PATCH 04/15] =?UTF-8?q?:rocket:=20=E6=B7=BB=E5=8A=A0=E6=96=87?= =?UTF-8?q?=E7=AB=A0=E3=80=8E=E6=B7=B1=E5=BA=A6=E5=AD=A6=E4=B9=A0=E7=B3=BB?= =?UTF-8?q?=E5=88=971=EF=BC=9A=E8=AE=BE=E7=BD=AE=20AWS=20&=20=E5=9B=BE?= =?UTF-8?q?=E5=83=8F=E8=AF=86=E5=88=AB=E3=80=8F=E5=88=B0=E6=96=87=E7=AB=A0?= =?UTF-8?q?=E5=88=97=E8=A1=A8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- AI.md | 1 + README.md | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/AI.md b/AI.md index c639df90680..d05cd5f5652 100644 --- a/AI.md +++ b/AI.md @@ -1,3 +1,4 @@ +* [深度学习系列1:设置 AWS & 图像识别](https://juejin.im/post/5987f5885188256dcf65d01e?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([lileizhenshuai](https://github.com/lileizhenshuai) 翻译) * [深度学习的未来](https://juejin.im/post/597843506fb9a06ba4747db5?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([changkun](https://github.com/changkun) 翻译) * [论深度学习的局限性](https://juejin.im/post/5978352a6fb9a06bad6574a4?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) 
([CACppuccino](https://github.com/CACppuccino) 翻译) * [使用 Python+spaCy 进行简易自然语言处理](https://juejin.im/post/5971a4b9f265da6c42353332?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([lsvih](https://github.com/lsvih) 翻译) diff --git a/README.md b/README.md index 59a619d5906..32e14b33748 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [掘金翻译计划](https://juejin.im/tag/%E6%8E%98%E9%87%91%E7%BF%BB%E8%AF%91%E8%AE%A1%E5%88%92) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](#android)、[iOS](#ios)、[React](#react)、[前端](#前端)、[后端](#后端)、[产品](#产品)、[设计](#设计) 等领域,读者为热爱新技术的新锐开发者。 -掘金翻译计划目前翻译完成 [601](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 +掘金翻译计划目前翻译完成 [602](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 # 官方指南 @@ -25,10 +25,10 @@ ## AI / Deep Learning / Machine Learning +* [深度学习系列1:设置 AWS & 图像识别](https://juejin.im/post/5987f5885188256dcf65d01e?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([lileizhenshuai](https://github.com/lileizhenshuai) 翻译) * [深度学习的未来](https://juejin.im/post/597843506fb9a06ba4747db5?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([changkun](https://github.com/changkun) 翻译) * [论深度学习的局限性](https://juejin.im/post/5978352a6fb9a06bad6574a4?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([CACppuccino](https://github.com/CACppuccino) 翻译) * [使用 Python+spaCy 进行简易自然语言处理](https://juejin.im/post/5971a4b9f265da6c42353332?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([lsvih](https://github.com/lsvih) 翻译) -* [从金属巨人到深度学习](https://juejin.im/post/596f4cecf265da6c2f0adb04?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([XatMassacrE](https://github.com/XatMassacrE) 翻译) * [所有 AI 译文>>](https://github.com/xitu/gold-miner/blob/master/AI.md) ## Android From 28a9ef37177138c36bab7ad82ca8b576228aab0c Mon Sep 17 00:00:00 2001 From: 
sqrtthree Date: Tue, 8 Aug 2017 17:13:21 +0800 Subject: [PATCH 05/15] =?UTF-8?q?:rocket:=20=E6=B7=BB=E5=8A=A0=E6=96=87?= =?UTF-8?q?=E7=AB=A0=E3=80=8E=E8=AE=BE=E8=AE=A1=E4=BD=9C=E5=93=81=E9=9B=86?= =?UTF-8?q?=E7=BD=91=E7=AB=99=E7=9A=84=E7=9C=9F=E6=AD=A3=E8=A7=92=E8=89=B2?= =?UTF-8?q?=E6=98=AF=E4=BB=80=E4=B9=88=EF=BC=9F=E3=80=8F=E5=88=B0=E6=96=87?= =?UTF-8?q?=E7=AB=A0=E5=88=97=E8=A1=A8?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 ++-- design.md | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 32e14b33748..6d750591390 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [掘金翻译计划](https://juejin.im/tag/%E6%8E%98%E9%87%91%E7%BF%BB%E8%AF%91%E8%AE%A1%E5%88%92) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](#android)、[iOS](#ios)、[React](#react)、[前端](#前端)、[后端](#后端)、[产品](#产品)、[设计](#设计) 等领域,读者为热爱新技术的新锐开发者。 -掘金翻译计划目前翻译完成 [602](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 +掘金翻译计划目前翻译完成 [603](#近期文章列表) 篇文章,共有 [350](https://github.com/xitu/gold-miner/wiki/%E8%AF%91%E8%80%85%E7%A7%AF%E5%88%86%E8%A1%A8) 余名译者贡献翻译。 # 官方指南 @@ -82,10 +82,10 @@ ## 设计 +* [设计作品集网站的真正角色是什么?](https://juejin.im/post/598959b65188253d2968eaab?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([noturnot](https://github.com/noturnot) 翻译) * [原子设计:如何设计组件系统](https://juejin.im/post/59780066f265da6c3433872f?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([H2O-2](https://github.com/H2O-2) 翻译) * [为企业应用设计更好的表格](https://juejin.im/post/5976ecb65188250c855facc2?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([laiyun90](https://github.com/laiyun90) 翻译) * [UX 基于背后的合理化,而非设计](https://juejin.im/post/5971ce0d51882574623352ca?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([horizon13th](https://github.com/horizon13th) 翻译) -* 
[以排印为本,从内容出发](https://juejin.im/entry/5965c5b26fb9a06ba025074c/detail?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([cdpath](https://github.com/cdpath) 翻译) * [所有设计译文>>](https://github.com/xitu/gold-miner/blob/master/design.md) diff --git a/design.md b/design.md index acd93e4be23..554801ac786 100644 --- a/design.md +++ b/design.md @@ -1,3 +1,4 @@ +* [设计作品集网站的真正角色是什么?](https://juejin.im/post/598959b65188253d2968eaab?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([noturnot](https://github.com/noturnot) 翻译) * [原子设计:如何设计组件系统](https://juejin.im/post/59780066f265da6c3433872f?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([H2O-2](https://github.com/H2O-2) 翻译) * [为企业应用设计更好的表格](https://juejin.im/post/5976ecb65188250c855facc2?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([laiyun90](https://github.com/laiyun90) 翻译) * [UX 基于背后的合理化,而非设计](https://juejin.im/post/5971ce0d51882574623352ca?utm_source=gold-miner&utm_medium=readme&utm_campaign=github) ([horizon13th](https://github.com/horizon13th) 翻译) From 649bbccf05355e7c332ab9b9a1bad08bbc73cb63 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 14:28:09 +0800 Subject: [PATCH 06/15] :sparkles: Create why-i-havent-fixed-your-issue-yet.md --- TODO/why-i-havent-fixed-your-issue-yet.md | 59 +++++++++++++++++++++++ 1 file changed, 59 insertions(+) create mode 100644 TODO/why-i-havent-fixed-your-issue-yet.md diff --git a/TODO/why-i-havent-fixed-your-issue-yet.md b/TODO/why-i-havent-fixed-your-issue-yet.md new file mode 100644 index 00000000000..2be91b1f01a --- /dev/null +++ b/TODO/why-i-havent-fixed-your-issue-yet.md @@ -0,0 +1,59 @@ + + > * 原文地址:[Why I Haven’t Fixed Your Issue Yet](https://medium.com/@michlbrmly/why-i-havent-fixed-your-issue-yet-a24ab4bc0d55) + > * 原文作者:[Michael Bromley](https://medium.com/@michlbrmly) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 
本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/why-i-havent-fixed-your-issue-yet.md](https://github.com/xitu/gold-miner/blob/master/TODO/why-i-havent-fixed-your-issue-yet.md) + > * 译者: + > * 校对者: + + # Why I Haven’t Fixed Your Issue Yet + + ![](https://cdn-images-1.medium.com/max/1600/0*sBJnwQRCh05t-nXu.jpg) + +Hi there. You opened an issue with my project on GitHub, and it’s getting kind of stale by now. + +I am aware of it — GitHub was kind enough to send me an email containing your report, which I scanned one morning a couple of weeks ago while I ate breakfast. I’ve even thought about it briefly a couple of times since then; once I was in the shower and I got the vague idea that I knew what caused it — but I wasn’t sure because I could not recall the specifics. + +Of course, you knew none of this. You may have wondered if your issue — which may be critical to your current project — has been lost to the void. Allow me a few minutes to explain why you’ve not heard anything from me. + +A couple of years ago, I was a freelancer and father to a new baby. The freelance hours provided me with flexibility and the baby was conveniently immobile and fairly docile. I started writing libraries and publishing them on GitHub. Seeing people use my code was and is exciting and rewarding. Collecting stars on GitHub is a guilty pleasure just as with any other kind of “fake internet points”. I had plenty of time to work on issues and improvements, and I would generally respond to (and often fix) issues within a day or two. + +Today, I am in full-time employment. The baby is now a toddler, and there is also another baby. Toddlers are neither docile nor immobile. If I am lucky, I can carve out an hour of free time per day — generally between 9pm and 10pm. + +Do you know what I like to do in that time? Unfortunately for you, the answer is not *“fire up my IDE, get the build pipeline going, start a local dev server, and try to recreate someone else’s issue”*. 
And I don’t mean to berate; I am simply stating fact. My weary evening mind is just not up to the task most days. Usually I like to sit on the couch for a bit, and just enjoy sitting. + +So where does that leave you, user of my library? Do I no longer care about your plight? Have you done your company a disservice by using my library in your project? In this *Free and Open Source Software* (FOSS) reality we live in, how many parts of your company’s product are coupled to the lifestyle and priorities of some lone, unpaid package maintainer? It’s something I have to think about too — in my day-job I build software on top of many FOSS libraries, many of which are probably maintained by people in similar circumstances to my own. + +As with all things in life, a trade-off is involved. There is an implicit agreement which needs to be understood by both consumers and creators of FOSS projects¹. It goes something like this: + +- I agree to provide you with some free code which solves your problem. +- I recognize that in doing so, I have taken on a small portion of responsibility to you as a user of my code. +- I agree to try to help you if you have difficulty in using my code. +- I agree to try to fix bugs that you find in my code. +- Crucially, you agree that I, in acting without remuneration, am free to assign priority to the above points as I see fit. + +The last point is the reason why I haven’t fixed your issue yet. Your issue is competing for my attention with my work, my family, my couch, my other interests and of course with all the other issues that are still open — several of which are much older than yours. + +So, here is my message to you, to all the users of my FOSS projects, and to all developers who use and benefit from the FOSS ecosystem: + +> I will try my best to get around to it. I do want to help you. 
+ +> Make things easier for me by reading and following as closely as possible the [issue template](https://github.com/michaelbromley/ng2-pagination/blob/master/ISSUE_TEMPLATE.md). + +> Take time to understand, research and debug your issue — don’t offload that burden onto me. + +> Understand if I do not reply in what seems like a reasonable time. And don’t be that one guy who was extremely rude and insulting (I’ll refrain from linking the issue for his sake). + +Thanks for reading, and happy coding. + +¹ This goes for projects which are promoted in some way. If you tell people “hey, you should use my thing”, then you’ve entered into this agreement with them. If you just throw your stuff up onto GitHub for the fun of it, then it doesn’t necessarily apply. + +--- + +*Originally published at *[*www.michaelbromley.co.uk*](https://www.michaelbromley.co.uk/blog/why-i-havent-fixed-your-issue-yet/)*.* + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From eb011adae5ff02d12bf35a0dcf627951b484cb29 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 14:44:18 +0800 Subject: [PATCH 07/15] :sparkles: Create how-vr-is-changing-ux-from-prototyping-to-device-design.md --- ...ng-ux-from-prototyping-to-device-design.md | 124 ++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md diff --git 
a/TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md b/TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md new file mode 100644 index 00000000000..1b3c14f4148 --- /dev/null +++ b/TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md @@ -0,0 +1,124 @@ + + > * 原文地址:[How VR Is Changing UX: From Prototyping To Device Design](https://uxplanet.org/how-vr-is-changing-ux-from-prototyping-to-device-design-a75e6b45e5f8) + > * 原文作者:[Justinmind](https://uxplanet.org/@justinmind) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md](https://github.com/xitu/gold-miner/blob/master/TODO/how-vr-is-changing-ux-from-prototyping-to-device-design.md) + > * 译者: + > * 校对者: + + # How VR Is Changing UX: From Prototyping To Device Design + + ![](https://cdn-images-1.medium.com/max/1600/1*MlFZkL6bQee0eoks7lGN_Q.png) + +## Virtual reality is changing the way we define user experience, but if one principal remains it’s that experiences must be centered around people + +Have you ever wondered what it might be like to travel into space? Or watch The Beatles perform in concert? With the latest developments in VR your dreams can become a simulated reality and you don’t even have to leave the comfort of your sofa. + +But what do these new technologies mean for user experience? With emerging platforms and rapid developments in technology, user experience is central for these technologies to succeed. Because if people are to adopt these new technologies in their daily lives, they need to be believable. + +As [Daniel Terdiman](https://www.fastcompany.com/3058259/for-oculus-to-succeed-vr-needs-to-succeed) points out + +> “[VR companies] are keenly aware that bad VR experiences on any platform or any device can turn people off to the entire technology forever.” + +Getting VR right is vital and that’s where UX comes in. + +### What is VR? 
+
+Before understanding how VR is changing UX, let’s look at which technologies have emerged in recent years and define them. There is some confusion among the different terms, so here is some clarity.
+
+To start, the three main technologies that alter reality are:
+
+#### Virtual reality (VR)
+
+Virtual reality creates a new world. A simulated reality, if you will. What VR does is transport the user to a different place, a place that is generated entirely by technology. If you need a mental image, think of VR headsets like Oculus Rift, which create your own world for you.
+
+#### Augmented reality (AR)
+
+Then there’s AR. AR overlays generated images or video on top of reality. Think Pokémon Go or the Ikea [catalog app](https://www.youtube.com/watch?v=vDNzTasuYEw), which lets you see how their furniture would look in your home before you buy it.
+
+#### Mixed reality (MR)
+
+Mixed reality is just that. It combines both generated imagery and real-world objects. According to Keith Curtin, it’s [the most important tech of 2017](https://thenextweb.com/insider/2017/01/07/mixed-reality-will-be-most-important-tech-of-2017/#.tnw_1frSRiaM). What mixed reality does is give real-world presence to intelligent virtual objects.
+
+### Does prototyping play a role in VR?
+
+Since the user experience is integral to virtual reality, prototyping is essential in creating a believable VR experience. Getting a VR experience right is paramount, and an interactive wireframe can get you closer to success with quick iterations.
+
+It’s necessary to define interactions and create a logical workflow, even when designing a VR experience. Even though it’s a virtual reality, UI design still takes center stage.
+ +While most of the design in VR is 3D, it’s still useful to prototype interfaces in 2D before you begin in 3D to save time and make incremental tweaks during user testing — you’d be surprised at where [prototyping can fit into a design process](https://www.justinmind.com/blog/how-to-improve-your-web-and-app-design-process-with-prototypes/). + +### Good UX for VR + +Bad experiences can hurt VR, so which principles are necessary to avoid any hiccups in a VR user experience? + +- Believable: An experience within VR must be believable. That means feeling as though you’re actually there +- Interactive: VR must be interactive to work well so when you extend your arm, the VR world must replicate those movements. +- Explorable: You must be able to walk (or fly…) around an environment. +- Immersive: Mix together exploration and believability and you get immersive — enjoying the experience from any angle. + +### VR UX success stories + +VR has myriad uses. One such use is helping senior citizens in assisted living. Rendever is a virtual reality company based in Massachusetts that uses VR to help older people enjoy life again. + +According to [Rendever](http://rendever.com/), 50% of residents in assisted living experience depression and isolation. The company sought to reduce this figure by using VR technology in an innovative way. + +### VR for senior citizens + +Imagine it: you’re a senior citizen who is incapacitated and unable to travel. But your granddaughter’s wedding is happening on the other side of the country. Normally this would result in a disappointed and upset grandparent who’s missing out on the big day. But with VR, sitting in the front row of the wedding is now possible thanks to VR. + +> ***“[Senior citizens] can experience powerful moments that a 2D picture won’t provide”*** + +### Travel during surgery with VR + +VR has uses in medicine, too. 
Surgeons have taken to VR to help their patients remain calm as they undergo important and life changing surgeries. + +Anesthetic is used to sedate patients but there are instances where this isn’t possible. To help reduce anxiety and stress during an operation, a private medical clinic in Mexico City uses VR headsets to transport patients to destinations like Machu Picchu in Peru to keep them distracted as they undergo treatment. And [it worked](https://mosaicscience.com/story/virtual-reality-VR-surgery-pain-mexico). + +### VR helps reduce pain among patients + +In California, psychologist Hunter Hoffman developed a VR game to aid pain reduction. SnowWorld attempts to direct a patient’s attention away from the pain and transports them into a snow filled world where they throw snowballs at penguins. Yep, really. + +What is notable about SnowWorld’s users was that they reported up to [50% less pain](https://thenextweb.com/insider/2017/05/09/study-vr-twice-as-effective-as-morphine-at-treating-pain/#.tnw_c6Wwxja2) than those who attempt other means to distract them from their pain — how’s that for a great user experience? + +### UX challenges when building VR experiences + +Users may fear trying VR if they’ve previously had a bad experience with it. So UX design, or better yet **good UX design**, must be central to any VR experience. That means every little detail must be considered, from proper lighting and fluid movements to realistic design. + +### Enhance the user experience in and out of virtual reality + +But UX goes beyond virtual reality. The device’s design plays a big role too. Nobody wants to wear a clunky headset that’s heavy and weighs them down. To enhance the VR experience, creating a lightweight and versatile headset is paramount otherwise users won’t be able to immerse themselves fully into virtual reality if all they have on their mind is a headset that’s causing them neck pain. 
+ +### Make your VR experience believable + +One of the overriding challenges that VR presents is that an experience may not look or feel real. Can you imagine diving into the cool waters of the Caribbean only to find poorly designed fish and badly designed terrain? + +The UX design of a VR experience ought to be as convincing as possible. That means giving users full control of the experience so they’re in the driving seat. Interactions are a must in order that users can forget they’re in a simulated reality. To address this UX problem, many VR experiences are in 360° for a fully immersive experience. + +#### Virtual reality has real life consequences + +While VR simulates a reality, don’t forget that there are real life consequences when using VR. That means vomit-proofing your VR experiences. No, really. Motion sickness is a thing. + +One sure fire way to turn people off VR is to give them headaches and nausea. With the **UI design**, simply avoid any rapid movements or velocity changes to stop users from wanting to run to the nearest bucket. + +### But what can UXers do to VR proof their practice? + +First and foremost, for any UXer approaching VR, is to understand how the technology works. The [nitty gritty of AR](https://www.wareable.com/vr/how-does-vr-work-explained). That means brushing up on some new vocabulary. Such as differentiating head tracking from motion tracking and learning what HMD means. + +#### Getting up to date with 3D-related tools + +As UX practitioners, we need to stay on top of the latest developments in technology and this includes user testing methodologies. We’re used to [user testing](https://www.loop11.com/user-testing-a-mobile-app-prototype-essential-checklist/) our mobile app prototypes. + +When we create a new design, continuous and rigorous user testing helps us to gauge if something has worked. 
+ +User testing in VR has its obvious drawbacks: it’s expensive, hard to supply multiple headsets to a large audience and testing something that’s attached to someone’s face is complicated. + +Understanding the methodologies behind [user testing with VR](https://omobono.com/insights/blog/designing-vr-how-conquer-challenges-user-testing-vr) is essential but conquering them isn’t impossible. Gone are the days of peeping over a user’s shoulder. + +Ultimately, VR opens up many doors for UX. But it’s UX with a twist. Pointing and clicking a mouse will seem kitsch when designing experiences that involve face and voice recognition, movement tracking and, potentially, brain waves. These are just a few of the new input methods that UXers will have to acquaint themselves with. + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From 2df7658826d6f2e0d04ae6efd180a119e80c7e7c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 14:52:18 +0800 Subject: [PATCH 08/15] :sparkles: Create ui-vs-ux-what-is-the-difference.md --- TODO/ui-vs-ux-what-is-the-difference.md | 97 +++++++++++++++++++++++++ 1 file changed, 97 insertions(+) create mode 100644 TODO/ui-vs-ux-what-is-the-difference.md diff --git a/TODO/ui-vs-ux-what-is-the-difference.md b/TODO/ui-vs-ux-what-is-the-difference.md new file mode 100644 index 00000000000..8caf55389e6 --- /dev/null +++ 
b/TODO/ui-vs-ux-what-is-the-difference.md @@ -0,0 +1,97 @@ + + > * 原文地址:[UI vs UX: What is the Difference?](https://www.sitepoint.com/ui-vs-ux-what-is-the-difference/) + > * 原文作者:[Darin Dimitroff](https://www.sitepoint.com/author/darin-dimitroff/) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/ui-vs-ux-what-is-the-difference.md](https://github.com/xitu/gold-miner/blob/master/TODO/ui-vs-ux-what-is-the-difference.md) + > * 译者: + > * 校对者: + + # UI vs UX: What is the Difference? + + As a full-stack/generalist product designer, I’ve watched the industry's obsession with titles for many years. Sure, we can keep being snarky about it, and continue throwing around funny tweets, but when did this help anyone? + +Just like “*Should designers code?*”, the “*UX vs UI*” question has turned into an inside joke. + +[![](https://ws3.sinaimg.cn/large/006tNc79ly1fidh1mes79j30qa0qetbr.jpg)](https://twitter.com/sdw/status/709853249407361024/photo/1) + +[![](https://ws2.sinaimg.cn/large/006tNc79ly1fidh23pw82j30pw104td8.jpg)](https://twitter.com/ezyjules/status/797121630287888384/photo/1) + +On one hand, this trend might help a more seasoned designer vent. In fact, there’s a [Tumblr entirely dedicated to it](https://shittyuiuxanalogies.tumblr.com/). + +However, there is a danger that it makes the industry look frustrating and inaccessible to a beginner. And that’s a shame because the barrier to entry for a new designer has never been so low (and I mean this in a completely positive way). + +## TL;DR + +**UI stands for User Interface.** It's what users interact with directly, everything they see, touch and hear within a piece of software or a website. It's the outermost layer of an app – the controls. 
+ +In its current state – due to the types of devices we are using – UI design is *mostly* a visual discipline, although voice and written word are gaining more and more traction thanks to voice assistants and conversational interfaces. + +**UX stands for User eXperience.** It's a holistic term encapsulating each and every different kind of touchpoint a user has with a product. + +In the context of a digital product, this includes not only the software's front-end itself, but the whole technical stack, customer service, branding, public image of the company, availability, pricing, and communication, and that's certainly not all. + +UI is a *subset* of UX. Both terms have different meanings and, whenever possible, should not be used interchangeably. The title “UX/UI designer” makes little sense in terms of semantics. Every UI designer is a UX designer by definition, and being a UX designer without a more specific field of work is quite rare. Talking about “UX” without specifics can quickly render any conversation meaningless. + +## So, what’s the problem with “UI/UX designer”? + +As I see it, there are three major issues: + +1. It’s misleading for everyone: designers, developers, recruiters, founders, etc. More and more junior designers put it in their titles – and I can’t blame them. Everyone is doing it and being honest about your skill-set can sometimes make you feel like you’re letting yourself fall behind the title trends. +2. It introduces a false career path where being just a “UI designer” is not enough. If digital product design was like Pokemon, "UX/UI designer" wouldn't be the cooler, 'levelled-up' version of 'UI designer". +3. It greatly undermines and almost tokenizes the importance of “UX design”, which is actually the sum of everyone’s work. + +## UI is a *part* of UX + +In a nutshell, a user interface is the layer where human-computer interaction happens. 
Although an important one, it’s just one layer of the whole user experience stack, which encapsulates multiple disciplines. + +Let's use a real-world example: watching TV. + +The UX of watching TV includes the content quality, specifications of the TV set, location, furniture, your current state of mind and a lot more. On the other hand, the *UI of watching TV* is just a small part of that: the design and build quality of the remote and the on-screen menus. + +## About *those* images + +![UI vs UX analogies](https://dab1nmslvvntp.cloudfront.net/wp-content/uploads/2017/08/1501634649path-e1501833312222.jpg) +UI vs UX analogies. Credit: [http://digitalfractal.com/](http://digitalfractal.com/) + +Yes, *those* images. I don’t want to sound like an old man yelling at clouds, but most of the side-by-side pictures comparing UI and UX are just missing the point. Designer Sebastian de With expressed the same sentiment about a year ago with [this tweet](https://twitter.com/sdw/status/709853249407361024). + +Many of the recurring images (like the one with the two ketchup bottles and the one with the shortcut through a grass lane) not only lack meaning but are often counter-productive. They pit UI and UX against each other, implying that UI designers’ work is useless because people already use the product in another way. + +Let’s take the ketchup bottle example and fix it. We’ll only need one bottle because comparing two different designs and labeling that “UI vs UX” just doesn’t make sense: + +![UI vs UX analogies. Ketchup example](https://dab1nmslvvntp.cloudfront.net/wp-content/uploads/2017/08/15016346441-cYDgrGRLkIioJxkHUjrqaA.jpeg) + +- the **UI layer** is the bottle, including the cap and the label. This means ANY bottle – not just the upright version[1](#fn1). 
+- the **UX layer** is a combination of the company’s branding and marketing efforts, the nutritional and sensory qualities of the ketchup itself, the act of discovering and buying it in a store or online, the company’s interactions with customers on social media and literally every other touch point between the company and its existing and potential customers. + +## Where did the term UX come from? + +UI is a much older term than UX. It’s been used since the earliest days of computing, as it’s a generic science term. UX, on the other hand, got popular much later. + +The term UX was popularized in the mid-'90s by Don Norman, one of the co-founders of the Nielsen Norman Group and author of best-selling book “The Design of Everyday Things”. From the very beginning, the meaning has been quite clear, Don Norman even has a [whole video](https://www.youtube.com/watch?v=9BdtGjoIN4E) addressing the modern specifics of the problem. + +## UI is not just GUI: designing an API + +When we think about interfaces, we're normally referring to Graphical User Interfaces or GUIs. Almost all of the devices and systems we're using today are interacted with within a visual set of paradigms: windows, icons, buttons, navigation bars, sliders, inputs, etc. + +UI and GUI (graphical user interface) have become almost synonymous and for a good reason: most users don’t have to interact with a code-based tool in their day to day. Still, UI is not just GUI. + +In my day-to-day work, I’m focused on front-end design systems, so designing the developer-facing interface layer of a design system is part of my job. In that sense, a component’s API surface is its interface layer for people who work in code. Designing and documenting an API’s surface area is among the hardest tasks I’ve tackled over the years. + +Even if you don’t code, I can’t recommend spending some time on this sort of task, even as a personal challenge. 
Modern frameworks like React are doing a really great job suggesting and even enforcing the practice of designers and developers working together in a more meaningful way than sharing a Sketch file and calling it a day. + +## Saying no to false dichotomies + +While being specific about titles and semantics in the design industry is generally a positive thing, taking the “UI vs UX” discussion too far may well be having adverse effects. It’s a heated topic, but I completely agree with Jared Spool: *Everyone is a designer*. + +I can’t count the times I’ve been gifted fantastic UI ideas from developers or taken valuable user flow improvements from visual designers. Being aware of your title within an organization is important – but only to the point where it’s not stopping you from contributing in areas outside of it. + +--- + +1. Heinz continues to sell the upright glass bottles because of nostalgia and some people not trusting plastic packaging. [↩](#fnref1) + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From 028ddd3bc3685f2635a80e2db7fbed7920aa49ae Mon Sep 17 00:00:00 2001 From: lsvih Date: Wed, 9 Aug 2017 16:08:28 +0800 Subject: [PATCH 09/15] =?UTF-8?q?=E5=A6=82=E4=BD=95=E5=B0=86=E6=97=B6?= =?UTF-8?q?=E9=97=B4=E5=BA=8F=E5=88=97=E9=97=AE=E9=A2=98=E7=94=A8=20Python?= =?UTF-8?q?=20=E8=BD=AC=E6=8D=A2=E6=88=90=E4=B8=BA=E7=9B=91=E7=9D=A3?= =?UTF-8?q?=E5=AD=A6=E4=B9=A0=E9=97=AE=E9=A2=98=20(#1977)?= 
MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...ries-supervised-learning-problem-python.md | 334 +++++++++--------- 1 file changed, 166 insertions(+), 168 deletions(-) diff --git a/TODO/convert-time-series-supervised-learning-problem-python.md b/TODO/convert-time-series-supervised-learning-problem-python.md index 3023b6a4fb7..6982a8258e4 100644 --- a/TODO/convert-time-series-supervised-learning-problem-python.md +++ b/TODO/convert-time-series-supervised-learning-problem-python.md @@ -3,38 +3,38 @@ > * 原文作者:[Dr. Jason Brownlee](http://machinelearningmastery.com/author/jasonb/) > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/convert-time-series-supervised-learning-problem-python.md](https://github.com/xitu/gold-miner/blob/master/TODO/convert-time-series-supervised-learning-problem-python.md) -> * 译者: +> * 译者:[lsvih](https://github.com/lsvih) > * 校对者: -# How to Convert a Time Series to a Supervised Learning Problem in Python +# 如何将时间序列问题用 Python 转换成为监督学习问题 -Machine learning methods like deep learning can be used for time series forecasting. +一些机器学习方法(例如深度学习)可以用于进行时间序列预测。 -Before machine learning can be used, time series forecasting problems must be re-framed as supervised learning problems. From a sequence to pairs of input and output sequences. +在使用这些机器学习方法前,必须先将时间序列预测问题转化为监督学习问题。也就是说,需要将一个时间序列转换成一组包含成对输入输出的序列。 -In this tutorial, you will discover how to transform univariate and multivariate time series forecasting problems into supervised learning problems for use with machine learning algorithms. +在这篇教程里,你将了解如何将单变量时间序列预测问题和多变量时间序列预测问题转换成监督学习问题,以使用机器学习算法。 -After completing this tutorial, you will know: +读完这篇教程,你将会了解: -- How to develop a function to transform a time series dataset into a supervised learning dataset. -- How to transform univariate time series data for machine learning. 
-- How to transform multivariate time series data for machine learning. +- 如何编写一个将时间序列数据集转换为监督学习数据集的函数。 +- 如何转换一元时间序列数据以使用机器学习。 +- 如何转换多元时间序列数据以使用机器学习。 -Let’s get started. +让我们开始吧。 ![How to Convert a Time Series to a Supervised Learning Problem in Python](http://3qeqpr26caki16dnhd19sv6by6v.wpengine.netdna-cdn.com/wp-content/uploads/2017/05/How-to-Convert-a-Time-Series-to-a-Supervised-Learning-Problem-in-Python.jpg) -How to Convert a Time Series to a Supervised Learning Problem in Python +题图:如何将时间序列问题用 Python 转换成为监督学习问题 -Photo by [Quim Gil](https://www.flickr.com/photos/quimgil/8490510169/), some rights reserved. +[Quim Gil](https://www.flickr.com/photos/quimgil/8490510169/) 拍摄,版权所有。 -## Time Series vs Supervised Learning +## 时间序列 vs 监督学习 -Before we get started, let’s take a moment to better understand the form of time series and supervised learning data. +在正式开始之前,让我们先花点时间更好地了解一下时间序列和监督学习的数据集结构。 -A time series is a sequence of numbers that are ordered by a time index. This can be thought of as a list or column of ordered values. +单个时间序列由一系列按照时间排序的数字序列组成。可以将其理解为一列有序值。 -For example: +例如: ``` 0 @@ -50,9 +50,9 @@ For example: ``` -A supervised learning problem is comprised of input patterns (*X*) and output patterns (*y*), such that an algorithm can learn how to predict the output patterns from the input patterns. +而一个监督学习问题是由一组输入(*X*)和一组输出(*y*)组成,算法可以学会如何通过输入值来预测输出值。 -For example: +例如: ``` X, y @@ -66,21 +66,21 @@ X, y 8, 9 ``` -For more on this topic, see the post: +可以参阅这篇文章,学习更多有关知识: - [Time Series Forecasting as Supervised Learning](http://machinelearningmastery.com/time-series-forecasting-supervised-learning/) -## Pandas shift() Function +## Pandas 的 shift() 函数 -A key function to help transform time series data into a supervised learning problem is the Pandas [shift()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.shift.html) function. 
+我们将时间序列数据转化为监督学习问题的关键就是使用 Pandas 的 [shift()](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.shift.html) 函数。 -Given a DataFrame, the *shift()* function can be used to create copies of columns that are pushed forward (rows of NaN values added to the front) or pulled back (rows of NaN values added to the end). +给定一个 DataFrame,*shift()* 函数会将输入的列复制一份,然后将副本列整体往后移动(最前面的数据空位会用 NaN 填充)或者往前移动(最后面的数据空位会用 NaN 填充)。 -This is the behavior required to create columns of lag observations as well as columns of forecast observations for a time series dataset in a supervised learning format. +这样可以创建一个滞后值列,加上观察列,就能将时间序列数据集变成监督学习数据集的格式。 -Let’s look at some examples of the shift function in action. +让我们看看 shift 函数实际用起来效果如何。 -We can define a mock time series dataset as a sequence of 10 numbers, in this case a single column in a DataFrame as follows: +我们可以通过下面的代码模拟一个长度为 10 的时间序列数据集,此时它在 DataFrame 中为单独的一列: ``` from pandas import DataFrame @@ -89,7 +89,7 @@ df['t'] = [x for x in range(10)] print(df) ``` -Running the example prints the time series data with the row indices for each observation. +运行上面的样例,将时间序列数据输出,其每一行都为带有索引的观察组数据。 ``` t 0 0 @@ -104,9 +104,9 @@ Running the example prints the time series data with the row indices for each ob 9 9 ``` -We can shift all the observations down by one time step by inserting one new row at the top. Because the new row has no data, we can use NaN to represent “no data”. +我们可以在数据顶部插入一行,将观察组的数据整体下挪一位。由于最上面插入的新行没有数据,因此我们可以用 NaN 填充来表示这儿“没有数据”。 -The shift function can do this for us and we can insert this shifted column next to our original series. +shift 函数可以完成这些操作。我们可以将 shift 函数“挪动”过的新列插入原始序列的旁边。 ``` from pandas import DataFrame @@ -116,9 +116,9 @@ df['t-1'] = df['t'].shift(1) print(df) ``` -Running the example gives us two columns in the dataset. The first with the original observations and a new shifted column. 
+运行上面的样例,你将得到一个包含两列的数据集。第一列是原始的观察组,第二列是经由 shift 函数挪动生成的新列。 -We can see that shifting the series forward one time step gives us a primitive supervised learning problem, although with *X* and *y* in the wrong order. Ignore the column of row labels. The first row would have to be discarded because of the NaN value. The second row shows the input value of 0.0 in the second column (input or *X*) and the value of 1 in the first column (output or *y*). +可以看到,经过将序列移动一次的操作之后,我们得到了一个原始的监督学习问题(虽然此时的 *X* 和 *y* 的排序明显是错的)。忽略行标签那一列,第一行存在 NaN 值,因此需要将其丢弃。在第二行,我们可以将第二列的 0.0 作为输入值(也就是 *X*),将第一列的 1 作为输出值(或 *y*)。 ``` t t-1 @@ -134,9 +134,9 @@ We can see that shifting the series forward one time step gives us a primitive s 9 9 8.0 ``` -We can see that if we can repeat this process with shifts of 2, 3, and more, how we could create long input sequences (*X*) that can be used to forecast an output value (*y*). +如果我们重复 shift 步骤,让原始列挪动 2 位、3 位或者更多位,我们就能得到一系列的输入数据(*X*),由这些输入值就能去预测输出值(*y*)了。 -The shift operator can also accept a negative integer value. This has the effect of pulling the observations up by inserting new rows at the end. Below is an example: +shift 操作也能接受负整数作为参数。如果你这么做,它会在列底部插入新行,从而使得原列向上移动。下面是例子: ``` from pandas import DataFrame @@ -146,9 +146,9 @@ df['t+1'] = df['t'].shift(-1) print(df) ``` -Running the example shows a new column with a NaN value as the last value. +运行上面的样例,可以看到新列中的最后一个值为 NaN。 -We can see that the forecast column can be taken as an input (*X*) and the second as an output value (*y*). That is the input value of 0 can be used to forecast the output value of 1. +此时可以将预测列作为输入值(*X*),将第二列作为输出值(*y*)。也就是给定输入值 0 可以用于预测输出值 1。 ``` t t+1 @@ -164,42 +164,42 @@ We can see that the forecast column can be taken as an input (*X*) and the secon 9 9 NaN ``` -Technically, in time series forecasting terminology the current time (*t*) and future times (*t+1*, *t+n*) are forecast times and past observations (*t-1*, *t-n*) are used to make forecasts.
+从技术上说,在时间序列预测问题的术语中,当前时间(*t*)和未来时间(*t+1, t+n*)为待预测时间,过去时间(*t-1, t-n*)则用于预测。 -We can see how positive and negative shifts can be used to create a new DataFrame from a time series with sequences of input and output patterns for a supervised learning problem. +从上面的例子中,我们可以学会如何通过 shift 函数正向或反向移动序列,生成新的 DataFrame,将时间序列问题转变成监督学习问题的输入-输出模式。 -This permits not only classical *X -> y* prediction, but also *X -> Y* where both input and output can be sequences. +这不仅可以解决经典的 *X -> y* 类预测问题,也可以用于输入输出值都是序列的 *X -> Y* 类预测。 -Further, the shift function also works on so-called multivariate time series problems. That is where instead of having one set of observations for a time series, we have multiple (e.g. temperature and pressure). All variates in the time series can be shifted forward or backward to create multivariate input and output sequences. We will explore this more later in the tutorial. +另外,shift 函数也能用于多元时间序列问题中。这类问题中包含多列观察组(例如温度、气压等)。时间序列中的所有变量都能通过向前或向后挪动,生成多元输入值与输出值序列。稍后我们将探讨这类问题。 -## The series_to_supervised() Function +## series_to_supervised() 函数 -We can use the *shift()* function in Pandas to automatically create new framings of time series problems given the desired length of input and output sequences. +我们可以使用 Pandas 的 *shift()* 函数,在给定希望得到的输入值、输出值序列长度后自动生成时间序列问题的新格式数据。 -This would be a useful tool as it would allow us to explore different framings of a time series problem with machine learning algorithms to see which might result in better performing models. +这是个很有用的工具。我们可以通过机器学习算法研究各种时间序列问题格式,探究哪种格式能够得到效果更佳的模型。 -In this section, we will define a new Python function named *series_to_supervised()* that takes a univariate or multivariate time series and frames it as a supervised learning dataset. +在本节中,我们将创建一个新的 Python 函数,名为 *series_to_supervised()*。它可以将多元时间序列问题与一元时间序列问题转换为监督学习数据集的格式。 -The function takes four arguments: +这个函数接收以下 4 个参数: -- **data**: Sequence of observations as a list or 2D NumPy array. Required.
-- **n_in**: Number of lag observations as input (*X*). Values may be between [1..len(data)] Optional. Defaults to 1. -- **n_out**: Number of observations as output (*y*). Values may be between [0..len(data)-1]. Optional. Defaults to 1. -- **dropnan**: Boolean whether or not to drop rows with NaN values. Optional. Defaults to True. +- **data**:必填,待转换的序列,数据类型为 list 或 2 维 NumPy array。 +- **n_in**: 可选,滞后组(作为输入值 X)的数量。范围可以在 [1..len(data)] 之间,默认值为 1。 +- **n_out**: 可选,观察组(作为输出值 y)的数量。范围可以在 [0..len(data)-1] 之间,默认值为 1。 +- **dropnan**:选填,决定是否抛去包含 NaN 的行。类型为 Boolean,默认值为 True。 -The function returns a single value: +函数将会返回一个值: -- **return**: Pandas DataFrame of series framed for supervised learning. +- **return**:返回监督学习格式的数据集,数据类型为 Pandas DataFrame。 -The new dataset is constructed as a DataFrame, with each column suitably named both by variable number and time step. This allows you to design a variety of different time step sequence type forecasting problems from a given univariate or multivariate time series. +新数据集 DataFrame 格式,每一列都由原变量名称和移动步数命名,让你可以根据给定的一元或多元时间序列问题设计出各种移动步数的序列。 -Once the DataFrame is returned, you can decide how to split the rows of the returned DataFrame into *X* and *y* components for supervised learning any way you wish. +在 DataFrame 返回时,你可以对其行进行分割,根据你的需要决定如何将返回的 DataFrame 分成 X 和 y 两部分。 -The function is defined with default parameters so that if you call it with just your data, it will construct a DataFrame with *t-1* as *X* and *t* as *y*. +这个函数的参数都设置了默认值,因此可以直接调用它处理你的数据,这种默认情况它将会返回一个 *t-1* 作为 X,*t* 作为 y 的 DataFrame。 -The function is confirmed to be compatible with Python 2 and Python 3. +这个函数已确定同时兼容 Python2 和 Python3。 -The complete function is listed below, including function comments. +下面为完整代码,并写好了注释: ``` from pandas import DataFrame @@ -207,51 +207,49 @@ from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): """ - Frame a time series as a supervised learning dataset. 
- Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values. - Returns: - Pandas DataFrame of series framed for supervised learning. + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg ``` -Can you see obvious ways to make the function more robust or more readable? +你觉得可以怎样提高这个函数的鲁棒性或者可读性吗?请留言在评论区。 -Please let me know in the comments below. +至此我们已经得到了整个函数,接下来探索它的用法。 -Now that we have the whole function, we can explore how it may be used. +## 单步单变量预测 -## One-Step Univariate Forecasting +在时间序列预测问题中通常使用滞后的观察值(例如 t-1)作为输入变量来预测当前时间步(t)。 -It is standard practice in time series forecasting to use lagged observations (e.g. t-1) as input variables to forecast the current time step (t). +这种问题被称为单步预测。 -This is called one-step forecasting. - -The example below demonstrates a one lag time step (t-1) to predict the current time step (t).
+下面展示了使用滞后一个时间步的时间(t-1)来预测当前时间(t)的例子。 ``` from pandas import DataFrame @@ -259,39 +257,43 @@ from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): """ - Frame a time series as a supervised learning dataset. - Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values. - Returns: - Pandas DataFrame of series framed for supervised learning. + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg + + values = [x for x in range(10)] + data = series_to_supervised(values) + print(data) ``` -Running the example prints the output of the reframed time series. +运行样例,输出转换后的时间序列。 ``` var1(t-1) var1(t) @@ -306,51 +308,51 @@ Running the example prints the output of the reframed time series. 9 8.0 9 ``` -We can see that the observations are named “*var1*” and that the input observation is suitably named (t-1) and the output time step is named (t). 
+可以看到,观察组被命名为“*var1*”,作为输入值的观察组被命名为(*t-1*),输出值组被命名为(*t*)。 -We can also see that rows with NaN values have been automatically removed from the DataFrame. +此外,可以看到包含 NaN 的行已经被自动从 DataFrame 中移除。 -We can repeat this example with an arbitrary number length input sequence, such as 3. This can be done by specifying the length of the input sequence as an argument; for example: +我们可以任意给定输入序列数量的值来重复运行这个例子。例如输入 3,我们事先已经将输入序列的数量定义为了一个参数。例如: ``` data = series_to_supervised(values, 3) ``` -The complete example is listed below. +完整样例如下: ``` from pandas import DataFrame from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): - """ - Frame a time series as a supervised learning dataset. - Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values. - Returns: - Pandas DataFrame of series framed for supervised learning. +""" + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... 
t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg @@ -361,7 +363,7 @@ data = series_to_supervised(values, 3) print(data) ``` -Again, running the example prints the reframed series. We can see that the input sequence is in the correct left-to-right order with the output variable to be predicted on the far right. +再次运行样例,输出重新构造的序列,可以看到输入序列准确无误地从左至右排列,待预测的输出变量在最右边。 ``` var1(t-3) var1(t-2) var1(t-1) var1(t) @@ -374,19 +376,19 @@ Again, running the example prints the reframed series. We can see that the input 9 6.0 7.0 8.0 9 ``` -## Multi-Step or Sequence Forecasting +## 多步或序列预测 -A different type of forecasting problem is using past observations to forecast a sequence of future observations. +还有一类预测问题:使用过去的观察组来对未来的观察组序列做预测。 -This may be called sequence forecasting or multi-step forecasting. +可以将这类问题称为序列预测问题或者多步预测问题。 -We can frame a time series for sequence forecasting by specifying another argument. For example, we could frame a forecast problem with an input sequence of 2 past observations to forecast 2 future observations as follows: +我们可以通过规定另一个参数来将序列预测问题的时间序列重新构造。例如,我们可以把 2 个过去的观察组转变为 2 个未来的观察组,从而重新构造预测问题: ``` data=series_to_supervised(values,2,2) ``` -The complete example is listed below: +完整样例如下: ``` from pandas import DataFrame @@ -394,33 +396,33 @@ from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): """ - Frame a time series as a supervised learning dataset. - Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values.
- Returns: - Pandas DataFrame of series framed for supervised learning. + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg @@ -430,7 +432,7 @@ data = series_to_supervised(values, 2, 2) print(data) ``` -Running the example shows the differentiation of input (t-n) and output (t+n) variables with the current observation (t) considered an output. +运行样例,可以看到将(*t-n*)作为输入变量、将(*t+n*)作为输出变量时,与将当前观察组(*t*)作为输出的不同之处。 ``` var1(t-2) var1(t-1) var1(t) var1(t+1) @@ -444,17 +446,15 @@ Running the example shows the differentiation of input (t-n) and output (t+n) va ``` -## Multivariate Forecasting - -Another important type of time series is called multivariate time series. +## 多元预测 -This is where we may have observations of multiple different measures and an interest in forecasting one or more of them. +还有一种重要的时间序列类型,叫做多元时间序列。 -For example, we may have two sets of time series observations obs1 and obs2 and we wish to forecast one or both of these. +这种情况我们会将多个不同的指标作为观察组,并预测它们中的一个或多个的值。 -We can call *series_to_supervised()* in exactly the same way. 
+例如,我们有两组时间序列观察组 obs1 和 obs2,希望预测它们或它们中的一者。 -For example: +我们同样可以调用 *series_to_supervised()*。例如: ``` from pandas import DataFrame @@ -462,33 +462,33 @@ from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): """ - Frame a time series as a supervised learning dataset. - Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values. - Returns: - Pandas DataFrame of series framed for supervised learning. + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg @@ -502,9 +502,9 @@ data = series_to_supervised(values) print(data) ``` -Running the example prints the new framing of the data, showing an input pattern with one time step for both variables and an output pattern of one time step for both variables. 
+运行样例,将会得到经过重新构造后的数据。其中两个变量各有一个时间步作为输入,各有一个时间步作为输出。 -Again, depending on the specifics of the problem, the division of columns into *X* and *Y* components can be chosen arbitrarily, such as if the current observation of *var1* was also provided as input and only *var2* was to be predicted. +与之前一样,根据问题的具体需要,可以任意地将列划分为 *X* 和 *y* 两部分,例如,可以将 *var1* 的当前观察值也作为输入,而只预测 *var2*。 ``` var1(t-1) var2(t-1) var1(t) var2(t) @@ -519,9 +519,9 @@ Again, depending on the specifics of the problem, the division of columns into * 9 8.0 58.0 9 59 ``` -You can see how this may be easily used for sequence forecasting with multivariate time series by specifying the length of the input and output sequences as above. +可以看到,像上面这样指定输入序列和输出序列的长度,就可以轻松地将它用于多元时间序列的序列预测。 -For example, below is an example of a reframing with 1 time step as input and 2 time steps as forecast sequence. +例如,下面以 1 个时间步作为输入、2 个时间步作为预测序列,重新构造数据: ``` from pandas import DataFrame @@ -529,33 +529,33 @@ from pandas import concat def series_to_supervised(data, n_in=1, n_out=1, dropnan=True): """ - Frame a time series as a supervised learning dataset. - Arguments: - data: Sequence of observations as a list or NumPy array. - n_in: Number of lag observations as input (X). - n_out: Number of observations as output (y). - dropnan: Boolean whether or not to drop rows with NaN values. - Returns: - Pandas DataFrame of series framed for supervised learning. + 函数用途:将时间序列转化为监督学习数据集。 + 参数说明: + data: 观察值序列,数据类型可以是 list 或者 NumPy array。 + n_in: 作为输入值(X)的滞后组的数量。 + n_out: 作为输出值(y)的观察组的数量。 + dropnan: Boolean 值,确定是否将包含 NaN 的行移除。 + 返回值: + 经过转换的用于监督学习的 Pandas DataFrame 序列。 """ n_vars = 1 if type(data) is list else data.shape[1] df = DataFrame(data) cols, names = list(), list() - # input sequence (t-n, ... t-1) + # 输入序列 (t-n, ... t-1) for i in range(n_in, 0, -1): cols.append(df.shift(i)) names += [('var%d(t-%d)' % (j+1, i)) for j in range(n_vars)] - # forecast sequence (t, t+1, ... t+n) + # 预测序列 (t, t+1, ... 
t+n) for i in range(0, n_out): cols.append(df.shift(-i)) if i == 0: names += [('var%d(t)' % (j+1)) for j in range(n_vars)] else: names += [('var%d(t+%d)' % (j+1, i)) for j in range(n_vars)] - # put it all together + # 将所有列拼合 agg = concat(cols, axis=1) agg.columns = names - # drop rows with NaN values + # drop 掉包含 NaN 的行 if dropnan: agg.dropna(inplace=True) return agg @@ -566,10 +566,9 @@ raw['ob2'] = [x for x in range(50, 60)] values = raw.values data = series_to_supervised(values, 1, 2) print(data) - ``` -Running the example shows the large reframed DataFrame. +运行样例,将会展示重新构造的很大的 DataFrame。 ``` var1(t-1) var2(t-1) var1(t) var2(t) var1(t+1) var2(t+1) @@ -583,18 +582,17 @@ Running the example shows the large reframed DataFrame. 8 7.0 57.0 8 58 9.0 59.0 ``` -Experiment with your own dataset and try multiple different framings to see what works best. - -## Summary +你可以用你自己的数据集多做几次实验,来试试哪种重构的效果更好。 -In this tutorial, you discovered how to reframe time series datasets as supervised learning problems with Python. +## 总结 -Specifically, you learned: +在这篇教程中,你已经了解了如何使用 Python 将时间序列数据集转换为监督学习问题。 -- About the Pandas *shift()* function and how it can be used to automatically define supervised learning datasets from time series data. -- How to reframe a univariate time series into one-step and multi-step supervised learning problems. -- How to reframe multivariate time series into one-step and multi-step supervised learning problems. 
+特别的,你了解了: +- 有关 Pandas *shift()* 函数的知识,以及它如何自动将时间序列数据转化为监督学习数据集。 +- 如何将一元时间序列重构成单步或多步监督学习问题。 +- 如何将多元时间序列重构成单步或多步监督学习问题。 --- From 5d21ee49f4be7023fa15293a9a4301c2526a67ee Mon Sep 17 00:00:00 2001 From: lsvih Date: Wed, 9 Aug 2017 16:10:43 +0800 Subject: [PATCH 10/15] =?UTF-8?q?=E6=B7=B1=E5=BA=A6=E5=AD=A6=E4=B9=A0?= =?UTF-8?q?=E7=B3=BB=E5=88=972=EF=BC=9A=E5=8D=B7=E7=A7=AF=E7=A5=9E?= =?UTF-8?q?=E7=BB=8F=E7=BD=91=E7=BB=9C=20(#2010)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ...earning-2-convolutional-neural-networks.md | 66 +++++++++---------- 1 file changed, 33 insertions(+), 33 deletions(-) diff --git a/TODO/deep-learning-2-convolutional-neural-networks.md b/TODO/deep-learning-2-convolutional-neural-networks.md index 9ebe6bc4ff1..91fe4da4a81 100644 --- a/TODO/deep-learning-2-convolutional-neural-networks.md +++ b/TODO/deep-learning-2-convolutional-neural-networks.md @@ -3,78 +3,78 @@ > * 原文作者:[Rutger Ruizendaal](https://medium.com/@r.ruizendaal) > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-2-convolutional-neural-networks.md](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-2-convolutional-neural-networks.md) -> * 译者: -> * 校对者: +> * 译者:[lsvih](https://github.com/lsvih) +> * 校对者:[edvardHua](https://github.com/edvardHua),[lileizhenshuai](https://github.com/lileizhenshuai) -# Deep Learning 2: Convolutional Neural Networks +# 深度学习系列2:卷积神经网络 -## How and what do CNNs learn? +## CNN 是怎么学习的?学习了什么? -*This post is part of a series on deep learning. 
Check-out part 1 *[*here*](https://medium.com/towards-data-science/deep-learning-1-1a7e7d9e3c07)* and part 3 *[*here*](https://medium.com/@r.ruizendaal/deep-learning-3-more-on-cnns-handling-overfitting-2bd5d99abe5d)*.* +**这篇文章是深度学习系列的一部分。你可以在**[**这里**](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-1-setting-up-aws-image-recognition.md)**查看第一部分,以及在**[**这里**](https://github.com/xitu/gold-miner/blob/master/TODO/deep-learning-3-more-on-cnns-handling-overfitting.md)**查看第三部分。** ![](https://cdn-images-1.medium.com/max/1600/1*z7hd8FZeI_eodazwIapvAw.png) -This week we will explore the inner workings of a Convolutional Neural Network (CNN). You might be wondering what happens inside these networks? And how do they learn? +这一周,我们将探索卷积神经网络(CNN)的内部工作原理。你可能会问:在网络内部究竟发生了什么?它们是怎样学习的? -The teaching philosophy behind the course I’m following is based on a top-down approach. Basically, we immediately get to play with the full models and as we go along we learn more and more about its inner workings. Therefore, these blog posts will gradually dive deeper into the inner workings of neural networks. This is only week 2 so we are starting to make steps towards that goal. +这门课程遵循自上而下的学习方法与理念。因此一般来说,我们在开始学习的时候就能立即玩到所有的模型,然后我们会逐渐深入其内部的工作原理。因此,本系列也将会逐渐深入探索神经网络的内部工作原理。现在仅仅是第二周,让我们朝着最终的目标迈进吧! -Last week I trained the Vgg16 model on a dataset of cat and dog images. I want to first start by addressing why using a pre-trained model is a good approach. In order to do this it is important to think about what these models are learning. In essence a CNN is learning filters and applying them to the images. These are not the same filters you apply to your Instagram selfies but the concept is not that different. The CNN takes a small square and starts applying it over the image, this square is often referred to as a ‘window’. The network than looks for parts of the image where this filter matches the contents of the image. 
In the first layer the network might learn simple things like diagonal lines. In each layer the network is able to combine these findings and continually learn more complex concepts. This all still sounds pretty vague, so let’s look at some examples. [Zeiler and Fergus (2013)](https://arxiv.org/abs/1311.2901) did a great job of visualizing what a CNN learns. This is the CNN they used in their paper. The Vgg16 model that won the Imagenet competition is based on this model. +在上周,我在猫狗图像集上训练了 Vgg 16 模型。我想先聊一下为什么说使用预先训练好的模型是一种很好的方法。为了使用这些模型,首先你得要弄清楚这些模型到底学习的是什么。从本质上说,CNN 学习的是过滤器,并将学习到的过滤器应用于图像。当然,这些“过滤器”和你在 Instagram 里用的滤镜(英文也为“filter”)并不是一种东西,但它们其实有一些相同之处。CNN 会使用一个小方块遍历整张图片,通常将这个小方块称为“窗口”。接下来,网络会在图片中查找与过滤器匹配的图片内容。在第一层,网络可能只学习到了一些简单的事物(例如对角线)。在之后的每一层中,网络都将结合前面找到的特征,持续学习更加复杂的概念。单单听这些概念可能会让人比较迷糊,让我们直接来看一些例子。[Zeiler and Fergus (2013)](https://arxiv.org/abs/1311.2901) 为可视化 CNN 学习过程做出了一项很棒的工作。下图是他们在论文中用的 CNN 模型,赢得 Imagenet 竞赛的 Vgg16 模型就是基于这个模型做出来的。 ![](https://cdn-images-1.medium.com/max/1600/1*vKyUGyRnJnZ3XOVVlvp80g.png) -CNN by Zeiler & Fergus (2013) -This image might look very confusing to you right now, don’t panic! Let’s start with some things that we can all see from this picture. First, the input images are square and 224x224 pixels. The filters that I talked about earlier are 7x7 pixels. The model has an input layer, 7 hidden layers and an output layer. C in the output layer refers to the number of classes the model will predict for. Now let’s go to the most interesting stuff: What the model learns in different layers! +CNN,作者:Zeiler & Fergus (2013) + +可能你现在会觉得这个图片很难懂,请不要慌!让我们先从我们可以在图中看到的东西说起吧。首先,输入图像是正方形,大小为 224x224 像素。我之前说的过滤器大小是 7x7 像素大小。该模型有一个输入层,7 个隐藏层以及一个输出层。输出层的“C”指的是模型的预测分类数量。现在让我们来了解 CNN 中最有趣的部分:这个神经网络在每一层中都学到了什么! ![](https://cdn-images-1.medium.com/max/1600/1*k57FsdDndnfb4FendDdnAw.png) -Layer 2 of the CNN. The left image represents what the CNN has learned and the right image has parts of actual images. 
-In layer 2 of the CNN the model is already picking up more interesting shapes than just diagonal lines. In the sixth square (counting horizontally) you can see that the model is picking up circular shapes. Also, the last square is looking at corners. +上图为 CNN 的第二层。左边的图像代表了 CNN 的这层网络在右边的真实图片中学习到的内容。 +在 CNN 的第二层中,你可以发现这个模型已经不仅仅是去提取对角线了,它找到了一些更有意思的形状特征。例如在第二排第二列的方块中,你可以看到模型正在提取圆形;还有,最后一个方块表明模型正在专注于识别图中的一个直角作为特征。 ![](https://cdn-images-1.medium.com/max/1600/1*7J5H2D0WSRBnEvI-BXfONg.png) -Layer 3 of the CNN -In layer 3 we can see that the model is starting to learn more specific things. The first square shows that the model is now able to recognize geographical patterns. The sixth square is recognizing car tires. And the eleventh square is recognizing people. +上图为 CNN 的第三层。 +在第三层中,我们可以看到模型已经开始学习一些更具体的东西。第一个方块中的图像表明模型已经能够识别出一些地理特征;第二排第二列的方块表明模型正在识别车轮;倒数第二个方块表明模型正在识别人类。 ![](https://cdn-images-1.medium.com/max/2000/1*QKxqFAp83WDU94N0a7AIpg.png) -Layers 4 and 5 of the CNN +CNN 的第四层与第五层 -Finally, layers 4 and 5 continue this trend. Layer 5 is picking up tings that are very useful for our dogs and cats problem. It is also recognizing unicycles and bird/reptile eyes. Be aware that these images only show a very small fraction of things learned by each layer. +在最后,第四层与第五层保持前面模型越来越具体的趋势。第五层找到了对解决我们的猫狗问题非常有帮助的特征。与此同时,它还识别出了独轮车,以及鸟类、爬行动物的眼睛。请注意,这些图像仅仅展示了每一层学习到的东西的极小一部分。 -Hopefully this shows you why using pre-trained models is useful. You can look up ‘transfer learning’ if you want to learn more about this research area. The Vgg16 model already knows a lot about recognizing dogs and cats. The training set for our cats and dogs problem has only 25.000 images. A new model might not be able to learn all these features from those images. Through a process called finetuning we can change the last layer of the Vgg16 model so that it does not output probabilities for a 1000 classes but only for 2, cats and dogs. 
+希望上面的文字已经告诉了你为什么使用预先训练好的模型是很有用的。如果你想更多地了解这块领域的研究,你可以搜索“迁移学习”(transfer learning)的相关内容。虽然我们的猫狗问题训练集仅仅只有 25000 张图片,一个新的模型可能还无法从这些图片中学习到所有的特征,但我们的 Vgg16 模型已经相当“了解”怎么去识别猫和狗了。最后,通过“微调”(Finetuning) Vgg16 模型的最后一层,让其不再输出 1000 种分类的概率,而是只输出两种分类的概率:猫和狗。 -If you are interested in reading more about the math behind deep learning, [Stanford’s CNN pages](http://cs231n.github.io/) provide a great resource. They also refer to shallow Neural Networks as “mathematically cute”, that’s a first. +如果你对深度学习背后的数学知识感兴趣,[Stanford’s CNN pages](http://cs231n.github.io/) 是很好的参考材料。他们还把浅层神经网络称为“数学上很可爱”(mathematically cute),这种说法我还是头一回见。 --- -#### Finetuning and Linear Layers +#### 微调及线性层(全连接层) -The pre-trained Vgg16 model that I used last week to classify cats and dogs does not naturally output these two categories. It actually puts out 1000 classes. Additionally, the model does not even output the classes ‘cats and dogs’ but it outputs specific breeds of cats and dogs. So how can we change this model efficiently to only classify the images as cats or dogs? +上周我用来对猫和狗分类的那个预训练 Vgg16 模型,本身并不会直接输出这两个类别,它实际上输出的是 1000 个类别。此外,这个模型甚至不会输出“猫”和“狗”这样的分类,而是输出猫和狗的一些特定品种。那我们如何高效地修改这个模型,让它只把图像分为猫或狗呢? -One option would be to manually map these breeds to cats and dogs and sum the probabilities. However, this method ignores some critical information. For example, if there is a bone in the picture that image is probably of a dog. But if we only look at the probabilities per breed this information would be lost. There we replace the linear (dense) layer at the end of the model and replace it with one that only outputs 2 classes. The Vgg16 model actually has 3 linear layers at the end. We can finetune all these layers and train them through backpropagation. Backpropagation is often seen as some kind of abstract magic, but it simply is calculating gradients using the chain rule. You’ll never have to worry about the details of the math. TensorFlow, Theano and other deep learning libraries will do that for you. +有种可选方案:手动将这些品种映射到猫和狗,然后把对应的概率相加。但是,这种做法会丢弃一些关键信息。例如,如果图片中有一根骨头,那么它很可能是一张狗的照片;但如果我们只看各个品种的概率,这类信息就会丢失。因此,我们把模型最后的线性层(全连接层)替换为一个只输出两种分类的线性层。实际上,Vgg16 模型的最后有 3 层全连接层。我们可以微调所有这些层,通过反向传播来训练它们。反向传播算法常常被人看成是一种抽象的魔法,但其实它只是用链式法则来计算梯度。你不必担心这些数学上的细节,TensorFlow、Theano 和其它深度学习库会帮你完成这些工作。 
+有种可选方案:手动将这些品种分到猫和狗中去,然后计算其概率之和。但是,这种做法会丢弃一些关键信息。例如,如果图片中只有一根骨头,但它很可能是一张属于狗的照片。如果我们仅查看这些品种分为猫狗的概率,前面提到的这种信息很可能会丢失。因此在模型的最后,我们加入一个线性层(全连接层),它将仅输出两种分类。实际上,Vgg16 模型的最后有 3 层全连接层。我们可以微调这些层,通过反向传播来训练它们。反向传播算法常常被人看成是一种抽象的魔法,但其实它只是简单应用链式求导法则。你可以暂时忽略这些数学上的细节,TensorFlow、Theano 和其它深度学习库已经帮你做好了这些工作。 -If you are going through the notebook for lesson 2 of the Fast AI course be aware of memory issues. I recommend that you first run the notebook using only the sample images. If you are using a p2 instance you can otherwise run out of memory if you keep saving and loading the numpy arrays. +如果你正在运行 Fast AI 课程 lesson 2 的 notebook,我建议你最好先只使用 notebook 的样例图片。如果你运行 p2 的实例,可能会由于保存、加载 numpy 数组将内存耗尽。 --- -#### Activation Functions +#### 激活函数 -We just discussed the linear layer at the end of the network. However, all layers in a Neural Network are not linear. After calculating the values for each of the neurons in the Neural Network we put these values through an activation function. An Artificial Neural Network basically consists of matrix multiplications. If we would only use linear calculations we could just stack these on top of each other. That would not be a very deep network … Therefore, we often use non-linear activation functions at each layer in the network. By stacking layers of linear and non-linear functions on top each other we can theoretically model anything. 
These are the three most popular non-linear activation functions: +前面我们讨论了网络最后的线性层(全连接层)。然而,神经网络的所有层都不是线性的。在神经网络计算出每个神经元的参数之后,我们需要将它们的计算结果作为参数输入到激活函数中。人工神经网络基本上由矩阵乘法组成,如果我们只使用线性计算的话,我们只能将它们一个个叠加在一起,并不能做成一个很深的网络。因此,我们会经常在网络的各层使用非线性的激活函数。通过将重重线性与非线性函数叠加在一起,理论上我们可以对任何事物进行建模。下面是三种最受欢迎的非线性激活函数: -- Sigmoid *(parses a value to be between 0 and 1)* -- TanH *(parses a value to be between -1 and 1)* -- ReLu *(If the value is negative it becomes 0, otherwise it stays the same)* +- Sigmoid **(将值转换到 0,1 间)** +- TanH **(将值转换到 -1,1 间)** +- ReLu **(如果值为负则输出 0,否则输出原值)** ![](https://cdn-images-1.medium.com/max/1600/1*feheZP3rz5va0QVpi9DVNg.png) -Three most used activation functions: Sigmoid, Tanh & Rectified Linear Unit (ReLu) -Currently, the ReLu is by far the most used non-linear activation function. The main reasons for this are that it reduces the likelihood of a vanishing gradient and sparsity. We will discuss these reasons in more detail later. The last layer of the model generally uses a different activation function, because we want this layer to have a certain output. The softmax function is very popular when doing classification. - -After finetuning the last layers in the Vgg16 model the model has 138.357.544 parameters. Thankfully we do not have the calculate all the gradients by hand :). Next week I will dive further into the workings of a CNN and we will discuss underfitting and overfitting. +上图为最常用的激活函数:Sigmoid、Tanh 和 ReLu(又名修正线性单元) +目前,ReLu 是使用的最多的非线性激活函数,主要原因是它可以减少梯度消失的可能性,以及保持稀疏特征。稍后会讨论这方面的更多详情。因为我们希望模型最后能够输出确定的内容,因此模型的最后一层通常使用一种另外的激活函数 —— softmax。softmax 函数是一种非常受欢迎的分类器。 -If you liked this posts be sure to recommend it so others can see it. You can also follow this profile to keep up with my process in the Fast AI course. See you there! +在微调完 Vgg16 模型的最后一层之后,它总共有 138357544 个参数。谢天谢地,我们不需要手动计算各种梯度 XD。下一周我们将更深入地了解 CNN 的工作原理,讨论主题为欠拟合和过拟合。 +如果你喜欢这篇文章,请将它推荐给其他人吧!你也可以关注此系列文章,跟上 Fast AI 课程的进度。下篇文章再会! 
--- -> [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)。 +> [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#%E5%89%8D%E7%AB%AF)、[后端](https://github.com/xitu/gold-miner#%E5%90%8E%E7%AB%AF)、[产品](https://github.com/xitu/gold-miner#%E4%BA%A7%E5%93%81)、[设计](https://github.com/xitu/gold-miner#%E8%AE%BE%E8%AE%A1) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)。 From 7fe1aa711d9b7c0d4fac7d1ced50fd805242da7d Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 16:20:23 +0800 Subject: [PATCH 11/15] :sparkles: Create binary-ast-newsletter-1.md --- TODO/binary-ast-newsletter-1.md | 303 ++++++++++++++++++++++++++++++++ 1 file changed, 303 insertions(+) create mode 100644 TODO/binary-ast-newsletter-1.md diff --git a/TODO/binary-ast-newsletter-1.md b/TODO/binary-ast-newsletter-1.md new file mode 100644 index 00000000000..97b9afcf447 --- /dev/null +++ b/TODO/binary-ast-newsletter-1.md @@ -0,0 +1,303 @@ + + > * 原文地址:[Towards a JavaScript Binary AST](https://yoric.github.io/post/binary-ast-newsletter-1/) + > * 原文作者:[Yoric](https://yoric.github.io/about/) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 
本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/binary-ast-newsletter-1.md](https://github.com/xitu/gold-miner/blob/master/TODO/binary-ast-newsletter-1.md) + > * 译者: + > * 校对者: + + # Towards a JavaScript Binary AST + + In this blog post, I would like to introduce the JavaScript Binary AST, an +ongoing project that we hope will help make webpages load faster, along with +a number of other benefits. + +# A little background + +Over the years, JavaScript has grown from one of the slowest scripting languages +available to a high-performance powerhouse, fast enough that it can run desktop, +server, mobile and even embedded applications, whether through web browsers or +other environments. + +As the power of JavaScript has grown, so has the complexity of applications +and their size. Whereas, twenty years ago, few websites used more than a few +Kb of JavaScript, many websites and non-web applications now need to deliver +and load several Mb of JavaScript before the user can start actually using +the site/app. + +While “several Mb of JavaScript” may sound odd, recall that a +native application such as Steam weighs 3.1Mb (pure binary, without resources, without +debugging symbols, without dynamic dependencies, measured on my Mac), +Telegram weighs 11Mb and the Opera *updater* weighs 5.8Mb. I’m not adding the +size of a web browser, because web browsers are architected essentially +from dynamic dependencies, but I expect that both Firefox and Chromium +weigh 100+ Mb. + +Of course, large JavaScript source code has several costs, including: + +- heavy network transfers; +- slow startup. + +We have reached a stage at which the simple duration of parsing the JavaScript +source code of a large web application such as Facebook can easily last +500ms-800ms on a fast computer – that’s *before* the JavaScript code can be +compiled to bytecode and/or interpreted. 
+There is very little reason to believe that JavaScript +applications will get smaller with time. + +So, a joint team from Mozilla and Facebook decided to get started working on a novel mechanism +that we believe can dramatically improve the speed at which an application can +start executing its JavaScript: the Binary AST. + +# Introducing the Binary AST + +The idea of the JavaScript Binary AST is simple: instead of sending text source +code, what could we improve by sending *binary* source code? + +Let me clarify: the Binary AST source code is equivalent to the text +source code. It is *not* a new programming language, or a subset of +JavaScript, or a superset of JavaScript, it *is* JavaScript. +It is *not* a bytecode, rather a binary representation of the source +code. If you prefer, this Binary AST representation is a form of *source +compression*, designed specifically for JavaScript, and optimized to +improve parsing speed. We are also building a decoder that provides a perfectly +readable, well-formatted, source code. For the moment, the format does not +maintain comments, but there is a proposal to allow comments to be maintained. + +Producing a Binary AST file will require a build step and we hope that, in time, +build tools such as WebPack or Babel will be able to produce Binary AST files, +hence making switching to Binary AST as simple as passing a flag to the build +chains already used by many JS developers. + +I plan to detail the Binary AST, our benchmarks and our current status in +future blog posts. For the moment, let me just mention that early experiments +suggest that we can both obtain very good source compression and considerable +parsing speedups. + +We have been working on Binary AST for a few months now and the project was +just accepted as a Stage 1 Proposal at ECMA TC-39. This is encouraging, but +it will take time until you see it implemented in all JavaScript VMs and toolchains. 
+ +# Comparing with… + +## …compression formats + +Most webservers already send JavaScript data using a compression format such as +gzip or brotli. This considerably reduces the time spent waiting for the data. + +What we’re doing here is a format specifically designed for JavaScript. Indeed, +our early prototype uses gzip internally, among many other tricks, +and has two main advantages: + +- it is designed to make *parsing* much faster; +- according to early experiments, we beat gzip or brotli by a large margin. + +Note that our main objective is to make parsing faster, so in the future, if we +need to choose between file size and parsing speed, we are most likely to pick +faster parsing. Also, the compression formats used internally may change. + +## …minifiers + +The tool traditionally used by web developers to decrease the size of JS files +is the minifier, such as UglifyJS or Google’s Closure Compiler. + +Minifiers typically remove unused whitespace and comments, rewrite variable +names to shorten them, and use a number of other transformations to make the +program shorter. + +While these tools are definitely useful, they have two main shortcomings: + +- they do not attempt to make parsing faster – indeed, we have witnessed a +number of cases in which minification accidentally makes parsing slower; +- they have the side-effect of making the JavaScript code much harder to read, +including renaming variables and functions to unreadable names, +using exotic features to pack variable declarations, etc. + +By contrast, the Binary AST transformation: + +- is designed to make parsing faster; +- maintains the source code in such a manner that it can be easily decoded +and read, with all variable names, etc. + +Of course, obfuscation and Binary AST transformation can be combined for +applications that do not wish to keep the source code readable. 
+ +## …WebAssembly + +Another exciting web technology designed to improve performance in certain +cases is WebAssembly (or wasm). wasm is designed to let native applications +be compiled in a format that can be transferred efficiently, parsed +quickly and executed at native speed by the JavaScript VM. + +By design, however, wasm is limited to native code, so it doesn’t work +with JavaScript out of the box. + +I am not aware of any project that achieves compilation of JavaScript to wasm. +While this would certainly be feasible, this would be a rather risky undertaking, +as this would involve developing a compiler that is at least as complex as a +new JavaScript VM, while making sure that it is still compatible with +JavaScript (which is both a very tricky language and a language whose +specifications are clarified or extended at least once per year). Of course, +this task ends up useless if the resulting code is slower than today’s JavaScript +VMs (which tend to be really, really fast) or so large that it makes startup +prohibitively slow (because that’s the problem we are trying to solve here) +or if it doesn’t work with existing JavaScript libraries or (for browser +applications) the DOM. + +Now, exploring this would definitely be interesting work, so if anybody +wants to prove us wrong, by all means, please do it :) + +## …improving caching + +When JavaScript code is downloaded by a browser, it is stored in the browser’s +cache, so as to avoid having to re-download it later. Both Chromium and Firefox +have recently improved their browsers to be able to cache not just the JavaScript +source code but also the bytecode, hence side-stepping nicely the issue of +parse time for the second load of a page. I have no idea of the status of +Safari or Edge on the topic, so it is possible that they may have comparable +technologies. + +Congratulations to both teams, these technologies are great! Indeed, they nicely +improve the performance of reloading a page. 
This works very well for pages that +have not updated their JavaScript code since the last time they were accessed. + +The problem we are attempting to solve with Binary AST is different: +while we all have some pages that we visit and revisit often, +there is a larger number of pages that we visit for the first time, +in addition to the pages that we revisit but that have been updated since +our latest visit. In particular, a +growing number of applications get updated very, very often – for instance, Facebook +ships new JavaScript code several times per day, and I would be surprised if +Twitter, LinkedIn, Google Docs et al. didn’t follow similar practices. Also, if +you are a JS +developer shipping a JavaScript application – whether web or otherwise – you +want the first contact between you and your users to be as smooth as possible, +which means that you want the first load (or first load since update) to be very +fast, too. + +These are problems that we address with Binary AST. + +# What if… + +## …we improved caching? + +Additional technologies have been discussed to let browsers prefetch and +precompile JS code to bytecode. + +These technologies are +definitely worth investigating and would also help with some of the scenarios +for which we are developing Binary AST – each technology improving the other. +In particular, the better resource-efficiency of Binary AST would thus help +limit the resource waste when such technologies are misused, while also +improving the cases in which these techniques cannot be used at all. + +## …we used an existing JS bytecode? + +Most, if not all, JavaScript Virtual Machines already use an internal +representation of code as JS bytecode. I seem to remember that at least +Microsoft’s Virtual Machine supports shipping JavaScript bytecode for +privileged applications. + +So, one could imagine browser vendors exposing their bytecode and letting +all JS applications ship bytecode. 
This, however, sounds like a pretty bad +idea, for several reasons. + +The first one affects VM developers. Once you have exposed your internal +representation of JavaScript, you are doomed to maintain it. As it turns out, +JavaScript bytecode changes regularly, to adapt to new versions of the language +or to new optimizations. Forcing a VM to keep compatibility with an old version +of its bytecode forever would be a maintenance and/or performance disaster, so +I doubt that any browser/VM vendor will want to commit to this, except perhaps +in a very limited setting. + +The second affects JS developers. Having several bytecodes would mean +maintaining and shipping several binaries – possibly several dozens if you want to +fine-tune optimizations to successive versions of each browser’s bytecode. To +make things worse, these bytecodes will have different semantics, leading to +JS code compiled with different semantics. +While +this is in the realm of the possible – after all, mobile and native developers +do this all the time – this would be a clear regression from the current JS landscape. + +## …we had a standard JS bytecode? + +So what if the JavaScript VM vendors decided to come up with a novel bytecode +format, possibly as an extension of WebAssembly, but designed specifically for +JavaScript? + +Just to be clear: I have heard people regretting that such a format did not +exist but I am not aware of anybody actively working on this. + +One of the reasons people have not done this yet is that designing and +maintaining bytecode for a language that changes all the time is +quite complicated – doubly so for a language that is already as complex +as JavaScript. More importantly, keeping the interpreted-JavaScript +and the bytecode-JavaScript in sync would most likely be a losing battle, +one that would eventually result in two subtly incompatible JavaScript languages, +something that would deeply hurt the web. 
+ +Also, whether such a bytecode +would actually help code size and performance remains to be demonstrated. + +## …we just made the parser faster? + +Wouldn’t it be nice if we could *just* make the parser faster? Unfortunately, +while JS parsers have improved considerably, we are long past the point of +diminishing returns. + +Let me quote a few steps that simply cannot be skipped or made infinitely +efficient: + +- dealing with exotic encodings, Unicode byte order marks and other niceties; +- finding out if this `/` character is a division operator, the start of a +comment or a regular expression; +- finding out if this `(` character starts an expression, a list of arguments +for a function call, a list of arguments for an arrow function, …; +- finding out where this string (respectively string template, array, function, +…) stops, which depends on all the disambiguation issues, …; +- finding out whether this `let a` declaration is valid or whether it +collides with another `let a`, `var a` or `const a` declaration – +which may actually appear later in the source code; +- upon encountering a use of `eval`, determining which of the 4 semantics +of `eval` to use; +- determining how truly local local variables are; +- … + +Ideally, VM developers would like to be able to parallelize parsing and/or +delay it until we know for sure that the code we parse is actually used. +Indeed, most recent VMs implement these strategies. Sadly, the numerous +token ambiguities in the JavaScript syntax considerably limit the opportunities +for concurrency while the constraints on when syntax errors must be thrown +considerably limit the opportunities for lazy parsing. + +In either case, the VM needs to perform an expensive pre-parse step that can +often backfire into being slower than regular parsing, typically when applied +to minified code. + +Indeed, the Binary AST proposal was designed to overcome the performance +limitations imposed by the syntax and semantics of text source JavaScript. 
+ +# What now? + +We are posting this blog entry early because we want you, web developers and +tooling developers, to be in the loop as early as possible. So far, the +feedback we have gathered from both groups is pretty good, and we are looking +forward to working closely with both communities. + +We have completed an early prototype for benchmarking purposes (so, not really +usable) and are working on an advanced prototype, both for the tooling and +for Firefox, but we are still a few months away from something useful. + +I will try and post more details in a few weeks’ time. + +For more reading: + +- [Bug tracking early experiments in Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1349917). +- [ECMA TC-39 Proposal](https://github.com/syg/ecmascript-binary-ast). +- [Tooling](https://github.com/Yoric/binjs-ref) (this is a WIP version of the advanced prototype, but it doesn’t reimplement everything from the early prototype yet). + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From 7af118164cc7ac1a0326de1c8f203ad686a82fb1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 16:52:28 +0800 Subject: [PATCH 12/15] :sparkles: Create dont-use-automatic-image-sliders-or-carousels.md --- ...se-automatic-image-sliders-or-carousels.md | 144 ++++++++++++++++++ 1 file changed, 144 insertions(+) create mode 100644
TODO/dont-use-automatic-image-sliders-or-carousels.md diff --git a/TODO/dont-use-automatic-image-sliders-or-carousels.md b/TODO/dont-use-automatic-image-sliders-or-carousels.md new file mode 100644 index 00000000000..4c1eb55f961 --- /dev/null +++ b/TODO/dont-use-automatic-image-sliders-or-carousels.md @@ -0,0 +1,144 @@ + + > * 原文地址:[Don’t Use Automatic Image Sliders or Carousels](https://conversionxl.com/dont-use-automatic-image-sliders-or-carousels/) + > * 原文作者:[Peep Laja](https://conversionxl.com/author/peep-laja/) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/dont-use-automatic-image-sliders-or-carousels.md](https://github.com/xitu/gold-miner/blob/master/TODO/dont-use-automatic-image-sliders-or-carousels.md) + > * 译者: + > * 校对者: + + # Don’t Use Automatic Image Sliders or Carousels + + [![Don't Use Automatic Image Sliders or Carousels](https://conversionxl.com/wp-content/uploads/2012/09/slider.jpg)](https://conversionxl.com/dont-use-automatic-image-sliders-or-carousels/) + +I’m sure you’ve come across dozens, if not hundreds of image sliders or carousels (also called ‘rotating offers’). You might even like them. But the truth is that they’re conversion killers. + +So if they’re not effective, why do people use them? 2 reasons: + +- Some people think they’re cool. But cool does not make you money – at least not this way. +- Different departments and managers want to get their message on the home page. Design by committee never fails to fail. + +**Note: if you’d like to grow your business by becoming a top data-driven marketer, enroll in [CXL Institute](https://conversionxl.com/institute/). ** + +## What The Tests Say + +I’m not alone. 
Pretty much any [conversion optimization](https://conversionxl.com/conversion-optimization-guide/) expert who runs a lot of tests says the same thing: + +> We have tested rotating offers many times and have found it to be a poor way of presenting home page content. + +[Chris Goward, Wider Funnel](http://www.widerfunnel.com/conversion-rate-optimization/rotating-offers-the-scourge-of-home-page-design) + +> Rotating banners are absolutely evil and should be removed immediately. + +[Tim Ash, Site Tuners](http://www.clickz.com/clickz/column/2164452/rotating-banners) + +Jakob Nielsen (yes, the usability guru) [confirms this in tests](http://www.nngroup.com/articles/auto-forwarding/). They ran a usability study where they gave users the following task: “*Does Siemens have any special deals on washing machines?*” The information was on the most prominent slide. The users could not see it – totally hit by banner blindness. Nielsen concludes the sliders are ignored. + +Notre Dame University [tested it](https://vwo.com/blog/image-slider-alternatives/) too. Only the first slide got some action (1%!), other slides hardly got clicked on at all. 1% of clicks for something that takes up (more than) half the page? + +Product design guru Luke Wroblewski summed it up like this: + +[![](https://ws3.sinaimg.cn/large/006tNc79ly1fidkhz15ekj30t60hyq5f.jpg)](https://twitter.com/lukew/status/293857685546360834) + +There’s a [discussion](https://ux.stackexchange.com/questions/10312/are-carousels-effective) about automatic sliders on StackExchange UX. + +Here are some of the things different people who tested them said: + +> Almost all of the testing I’ve managed has proven content delivered via carousels to be missed by users. Few interact with them and many comment that they look like adverts and so we’ve witnessed the banner blindness concept in full effect.
+> +> In terms of space saving and content promotion a lot of competing messages get delivered in a single position that can lead to focus being lost. + +[Adam Fellows](https://ux.stackexchange.com/users/5208/adam-fellowes) + +Here’s another one: + +> Carousels are effective at being able to tell people in Marketing/Senior Management that their latest idea is now on the Home Page. +> +> They are next to useless for users and often “skipped” because they look like advertisements. Hence they are a good technique for getting useless information on a Home Page (see first sentence of this post). +> +> In summary, use them to put content that users will ignore on your Home Page. Or, if you prefer, don’t use them. Ever. +> +> btw these views are not my own, but are based upon observing thousands of tests with users. + +[Lee Duddell](https://ux.stackexchange.com/users/7552/lee-duddell) + +And the last one: + +> In all the testing I have done, home page carousels are completely ineffective. For one, anything beyond the initial view has a huge decrease in visitor interaction. And two, the chances that the information being displayed in the carousel matches what the visitor is looking for is slim. So in that case the carousel becomes a very large banner that gets ignored. In test after test the first thing the visitor does when coming to a page with a large carousel is scroll right past it and start looking for triggers that will move them forward with their task. + +[Craig Kistler](https://ux.stackexchange.com/users/7548/craig-kistler) + +Here are 2 main reasons as to **why** it doesn’t work: + +## Reason #1: Human Eye Reacts To Movement (and will miss the important stuff) + +Our brains have 3 layers; the oldest part is the one we share even with reptiles. It’s mostly concerned with survival. A sudden change on the horizon could be a matter of life and death. Hence the human eye reacts to movement – including constantly moving image sliders and carousels.
+ +[![eye](https://conversionxl.com/wp-content/uploads/2012/09/eye.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/eye.jpg) + +**That’s good, right?** + +Unless the image slider is the only thing on your website (bad idea!), it’s not a good thing. It means it takes away attention from everything else – the stuff that actually matters. Like your [value proposition](https://conversionxl.com/value-proposition-examples-how-to-create/). Content of your site. Products. + +## Reason #2: Too Many Messages Equals No Message + +Image sliders get hit by banner blindness and most people won’t even pay attention to them, but even those who do can’t really get the messages. + +A visitor lands on your site. Sees a message on the slider – and starts reading it. “This fall you get to …” **Bam!** Gone. Often the sliders are so fast that people are not even able to finish reading them (even if they want to). + +Focusing on a single primary message and action is always far more effective. + +## Reason #3: Banner Blindness + +They look like banners and people just skip over them. + +## User Needs To Be In Control + +Carousels often have [terrible usability](http://uxmovement.com/navigation/big-usability-mistakes-designers-make-on-carousels/) – they move too quickly, have too small navigation icons (if any!) and often move automatically even if the user wants to browse their content manually. One of the key rules of user interface design is that [users need to be in control](http://bokardo.com/principles-of-user-interface-design/). + +These days so many ecommerce sites use rotating offers – and I think it’s not because they tested it, but due to herd mentality – “others have it, so should we”.
+ +Here’s [Forever21](http://www.forever21.com), guilty as charged – rotating between these 3 offers, changing every 4 seconds: + +[![21](https://conversionxl.com/wp-content/uploads/2012/09/21-1.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/21-1.jpg) + +If the very first offer people see is not what they like (=relevancy), then what? What if they don’t like any of the three? That’s certainly not going to help your [customer lifetime value](https://conversionxl.com/customer-lifetime-value/) get better. + +To their credit, once you touch the slider arrows, the automatic rotation stops. Not only that, but when you come back to their site at a later time, it opens up the slide that you wanted to see. + +I recommend that instead you have a single, static offer. + +Here’s [J.J. Buckley](http://www.jjbuckley.com/) with a static offer – focusing on a single message gets it delivered: + +[![jj](https://conversionxl.com/wp-content/uploads/2012/09/jj.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/jj.jpg) + +Some of the former carousel-users like Adobe, Gap and Hilton have also switched to static messages. + +[Adobe](https://www.adobe.com/): + +[![adobes](https://conversionxl.com/wp-content/uploads/2012/09/adobes.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/adobes.jpg) + +[Gap](http://www.gap.com): + +[![gap](https://conversionxl.com/wp-content/uploads/2012/09/gap.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/gap.jpg) + +Notice that [Hilton](http://www.hilton.com) has an image slider, but it does not move automatically. If you’re gonna do it, that’s the way to go: + +[![hilton](https://conversionxl.com/wp-content/uploads/2012/09/hilton.jpg)](https://conversionxl.com/wp-content/uploads/2012/09/hilton.jpg) + +## **Conclusion** + +If you can, avoid them. Don’t follow the fad (it will pass), follow the money instead. + +So what do you do instead? 
You either use static images or this: + +[![](https://ws2.sinaimg.cn/large/006tNc79ly1fidkitq5yjj30te0j0n05.jpg)](https://twitter.com/erunyon/status/293868617886486529) + +Brad Frost acknowledged “*Even though carousels aren’t that effective, I somehow don’t think they’re going away any time soon*” and wrote this piece on how to [make the carousel work better](http://bradfrostweb.com/blog/post/carousels/). + +What’s your experience with carousels – both as a website owner and a user? + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From 7f132ad4b13ed0e765662dd07e00d6ae6f04035f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 17:36:51 +0800 Subject: [PATCH 13/15] :sparkles: Create error-handling-in-rxjava.md --- TODO/error-handling-in-rxjava.md | 129 +++++++++++++++++++++++++++++++ 1 file changed, 129 insertions(+) create mode 100644 TODO/error-handling-in-rxjava.md diff --git a/TODO/error-handling-in-rxjava.md b/TODO/error-handling-in-rxjava.md new file mode 100644 index 00000000000..407cdf822c0 --- /dev/null +++ b/TODO/error-handling-in-rxjava.md @@ -0,0 +1,129 @@ + + > * 原文地址:[Error handling in RxJava](https://rongi.github.io/kotlin-blog/rxjava/rx/2017/08/01/error-handling-in-rxjava.html) + > * 原文作者:[Dmitry Ryadnenko](https://twitter.com/KotlinBlog) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 
本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/error-handling-in-rxjava.md](https://github.com/xitu/gold-miner/blob/master/TODO/error-handling-in-rxjava.md) + > * 译者: + > * 校对者: + + # Error handling in RxJava + + ![Drawing](https://rongi.github.io/kotlin-blog/assets/error-handling-in-rxjava-title.jpg) + +Once you start writing RxJava code you realize that some things can be done in different ways and sometimes it’s hard to identify best practices right away. Error handling is one of these things. + +So, what is the best way to handle errors in RxJava and what are the options? + +## Handling errors in onError consumer + +Let’s say you have an observable that can produce an exception. How do you handle that? The first instinct is to handle errors directly in the `onError` consumer. + + userProvider.getUsers().subscribe( + { users -> onGetUsersSuccess(users) }, + { e -> onGetUsersFail(e) } // Stop the progress, show error message, etc. + ) + +It’s similar to what we used to do with `AsyncTask` and looks pretty much like a try-catch block. + +There is one big problem with this though. Say there is a programming error inside the `userProvider.getUsers()` observable that leads to a `NullPointerException` or something like this. It would be super convenient here to crash right away so we can detect and fix the problem on the spot. But we’ll see no crash; the error will be handled as an expected one: an error message will be shown, or the error will be handled in some other graceful way. + +Even worse is that there wouldn’t be any crash in the tests. The tests will just fail with mysterious unexpected behavior. You’ll have to spend time on debugging instead of seeing the reason right away in a nice call stack. + +## Expected and unexpected exceptions + +Just to be clear, let me explain what I mean here by expected and unexpected exceptions. + +Expected exceptions are those that are expected to happen in a bug-free program.
Examples here are various kinds of IO exceptions, like no network exception, etc. Your software is supposed to react to these exceptions gracefully, showing error messages, etc. Expected exceptions are like a second valid return value; they are part of the method’s signature. + +Unexpected exceptions are mostly programming errors. They can and will happen during development, but they should never happen in the finished product. At least that’s the goal. But if they do happen, usually it’s a good idea just to crash the app right away. This helps to raise attention to the problem quickly and fix it as soon as possible. + +In Java, expected exceptions are mostly implemented using checked exceptions (subclassed directly from the `Exception` class). The majority of unexpected ones are implemented with unchecked exceptions and derived from `RuntimeException`. + +## Crashing on RuntimeExceptions + +So, if we want to crash, why not just check if the exception is a `RuntimeException` and rethrow it inside the `onError` consumer? And if it’s not, just handle it as we did in the previous example? + + userProvider.getUsers().subscribe( + { users -> onGetUsersSuccess(users) }, + { e -> + if (e is RuntimeException) { + throw e + } else { + onGetUsersFail(e) + } + } + ) + +This one may look nice, but it has a couple of flaws: + +1. In RxJava 2 this will crash in the live app but not in the tests. Which can be extremely confusing. In RxJava 1 though it will crash both in the tests and in the application. +2. There are more unchecked exceptions besides `RuntimeException` that we want to crash on. This includes `Error`, etc. It’s hard to track all exceptions of this kind. + +But the main flaw is this: + +During application development your Rx chains will become more and more complex. Also your observables will be reused in different places, in the contexts you never expected them to be used in.
+ +Imagine you’ve decided to use the `userProvider.getUsers()` observable in this chain: + + Observable.concat(userProvider.getUsers(), userProvider.getUsers()) + .onErrorResumeNext(just(emptyList())) + .subscribe { println(it) } + +What will happen if both `userProvider.getUsers()` observables emit an error? + +Now, you may think that both errors will be mapped to an empty list and so two empty lists will be emitted. You may be surprised to see that actually only one list is emitted. This is because the error that occurred in the first `userProvider.getUsers()` will terminate the whole chain upstream and the second parameter of `concat` will never be executed. + +You see, errors in RxJava are pretty destructive. They are designed as fatal signals that stop the whole chain upstream. They aren’t supposed to be part of the interface of your observable. They perform as unexpected errors. + +Observables designed to emit errors as a valid output have a limited scope of possible use. It’s not obvious how complex chains will work in case of an error, so it’s very easy to misuse this kind of observable. And this will result in bugs. Very nasty kind of bugs, those that are reproducible only occasionally (on exceptional conditions, like lack of network) and don’t leave stack traces. + +## Result class + +So, how do you design observables that return expected errors? Just make them return some kind of `Result` class, which will contain either the result of the operation or an exception. Something like this: + + data class Result<T>( + val data: T?, + val error: Throwable? + ) + +Wrap all expected exceptions into this and let all unexpected ones fall through and crash the app. Avoid using `onError` consumers, let RxJava do the crashing for you. + +Now, while this approach doesn’t look particularly elegant or intuitive and produces quite a bit of boilerplate, I’ve found that it causes the least amount of problems. Also, it looks like this is an “official” way to do error handling in RxJava.
I saw it recommended by RxJava maintainers in multiple discussions across the Internet. + +## Some useful code snippets + +To make your Retrofit observables return `Result` you can use this handy extension function: + + fun <T> Observable<T>.retrofitResponseToResult(): Observable<Result<T>> { + return this.map { it.asResult() } + .onErrorReturn { + if (it is HttpException || it is IOException) { + return@onErrorReturn it.asErrorResult() + } else { + throw it + } + } + } + + fun <T> T.asResult(): Result<T> { + return Result(data = this, error = null) + } + + fun <T> Throwable.asErrorResult(): Result<T> { + return Result(data = null, error = this) + } + +Then your `userProvider.getUsers()` observable can look like this: + + class UserProvider { + fun getUsers(): Observable>> { + return myRetrofitApi.getUsers() + .retrofitResponseToResult() + } + } + + + --- + + > [掘金翻译计划](https://github.com/xitu/gold-miner) 是一个翻译优质互联网技术文章的社区,文章来源为 [掘金](https://juejin.im) 上的英文分享文章。内容覆盖 [Android](https://github.com/xitu/gold-miner#android)、[iOS](https://github.com/xitu/gold-miner#ios)、[React](https://github.com/xitu/gold-miner#react)、[前端](https://github.com/xitu/gold-miner#前端)、[后端](https://github.com/xitu/gold-miner#后端)、[产品](https://github.com/xitu/gold-miner#产品)、[设计](https://github.com/xitu/gold-miner#设计) 等领域,想要查看更多优质译文请持续关注 [掘金翻译计划](https://github.com/xitu/gold-miner)、[官方微博](http://weibo.com/juejinfanyi)、[知乎专栏](https://zhuanlan.zhihu.com/juejinfanyi)。 + \ No newline at end of file From de47c2d1608dab12e5a18b5ada4233f5cdca3252 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 17:51:44 +0800 Subject: [PATCH 14/15] :sparkles: Create your-node-js-authentication-tutorial-is-wrong.md --- ...ode-js-authentication-tutorial-is-wrong.md | 126 ++++++++++++++++++ 1 file changed, 126 insertions(+) create mode 100644 TODO/your-node-js-authentication-tutorial-is-wrong.md diff --git a/TODO/your-node-js-authentication-tutorial-is-wrong.md
b/TODO/your-node-js-authentication-tutorial-is-wrong.md new file mode 100644 index 00000000000..cefa127d4c9 --- /dev/null +++ b/TODO/your-node-js-authentication-tutorial-is-wrong.md @@ -0,0 +1,126 @@ + + > * 原文地址:[Your Node.js authentication tutorial is (probably) wrong](https://medium.com/@micaksica/your-node-js-authentication-tutorial-is-wrong-f1a3bf831a46) + > * 原文作者:[micaksica](https://medium.com/@micaksica) + > * 译文出自:[掘金翻译计划](https://github.com/xitu/gold-miner) + > * 本文永久链接:[https://github.com/xitu/gold-miner/blob/master/TODO/your-node-js-authentication-tutorial-is-wrong.md](https://github.com/xitu/gold-miner/blob/master/TODO/your-node-js-authentication-tutorial-is-wrong.md) + > * 译者: + > * 校对者: + + # Your Node.js authentication tutorial is (probably) wrong + + **tl;dr:** I went on a search of Node.js/Express.js authentication tutorials. All of them were incomplete or made a security mistake in some way that can potentially hurt new users. This post explores some common authentication pitfalls, how to avoid them, and what to do to help yourself when your tutorials don’t help you anymore. I am still searching for a robust, all-in-one solution for authentication in Node/Express that rivals Rails’s [Devise](https://github.com/plataformatec/devise). + +> **Update (Aug 7)**: RisingStack has reached out and [no longer stores passwords in plaintext](https://github.com/RisingStack/nodehero-authentication/commit/9d69ea70b68c4971466c64382e5f038e3eda8d8a) in their tutorial, opting to move to bcrypt in their example codes and tutorials. +> **Update (Aug 8):** Editing title to *Your Node.js authentication tutorial is (probably) wrong*, as this post has improved some of these tutorials. + +In my spare time, I’ve been digging through various Node.js tutorials, as it seems that every Node.js developer with a blog has released their own tutorial on how to do things *the right way*, or, more accurately, *the way they do them*.
Thousands of front-end developers being thrown into the server-side JS maelstrom are trying to piece together actionable knowledge from these tutorials, either by cargo-cult-copypasta or gratuitous use of *npm install* as they scramble frantically to meet the deadlines set for them by outsourcing managers or ad agency creative directors. + +One of the more questionable things in Node.js development is that authentication is largely left as an exercise to the individual developer. The *de facto* authentication solution in the Express.js world is [Passport](http://passportjs.org/), which offers a host of *strategies* for authentication. If you want a robust solution similar to [Plataformatec’s Devise](https://github.com/plataformatec/devise) for Ruby on Rails, you’ll likely be pointed to [Auth0](https://auth0.com/), a startup that offers authentication as a service. + +Compared to Devise, Passport is simply authentication middleware, and does not handle any of the other parts of authentication for you: that means the Node.js developer is likely to roll their own API token mechanisms, password reset token mechanisms, user authentication routes and endpoints, and views in whatever templating language is the rage today. Because of this, there are a lot of tutorials that specialize in setting up Passport for your Express.js application, and nearly all of them are wrong in some way or another, and none properly implement the full stack necessary for a working web application. + +> **Note**: I’m not attempting to harass the developers of these tutorials specifically, but rather I am using their authentication mistakes to show security issues inherent in rolling your own authentication systems. If you are a tutorial writer, feel free to reach out to me once you’ve updated your tutorial. Let’s make Node/Express a safer ecosystem for new developers. + +### Mistake one: credential storage + +Let’s start with credential storage.
Storing and recalling credentials is pretty standard fare for identity management, and the traditional way to do this is in your own database or application. Passport, being middleware that simply says “this user is cool” or “this user is not cool”, requires the [passport-local](https://github.com/jaredhanson/passport-local) module for handling password storage in your own database, written by the same developer as Passport.js itself. + +Before we go down this tutorial rabbit hole, let’s remind ourselves of a [great cheat sheet for password storage](https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet) by OWASP, which boils down to “store high-entropy passwords with unique salts and one-way adaptive cost functions”. Or, really, Coda Hale’s [bcrypt meme](https://codahale.com/how-to-safely-store-a-password/), even though [there’s some contention](https://security.stackexchange.com/a/6415). + +As a new Express.js and Passport user, my first place to look will be the example code for *passport-local* itself, which [thankfully gives me a sample Express.js 4.0 application](https://github.com/passport/express-4.x-local-example) I can clone and extend. However, if I just copypasta this, I’m not left with too much, as there’s no database support in the example and it assumes I’m just using some set accounts. + +That’s OK, though, right? *It’s just an Intranet application*, the dev says, *and I have four other projects assigned to me due next week*. Of course, the passwords for the example aren’t hashed in any way, [and stored in plaintext right alongside the validation logic in this example](https://github.com/passport/express-4.x-local-example/blob/master/db/users.js). Credential storage isn’t even considered in this one. + +Let’s google for another tutorial using *passport-local*. 
I find [this quick tutorial from RisingStack in a series](https://blog.risingstack.com/node-hero-node-js-authentication-passport-js/) called “Node Hero”, but that doesn’t help me, either. They, too, [gave me a sample application on GitHub](https://github.com/RisingStack/nodehero-authentication), but it had [the same problems as the official one](https://github.com/RisingStack/nodehero-authentication/blob/7f808f5c8ea756155099b7b4a88390c356cf31be/app/authentication/init.js#L8). **(*Ed. 8/7/17:* RisingStack is [now using bcrypt](https://github.com/RisingStack/nodehero-authentication/commit/9d69ea70b68c4971466c64382e5f038e3eda8d8a) in their tutorial application.)** + +Next up, [here’s the fourth result](http://mherman.org/blog/2015/01/31/local-authentication-with-passport-and-express-4/) from Google for *express js passport-local tutorial*, written in 2015. It uses the Mongoose ODM and actually reads the credentials from my database. This one really has everything, including integration tests and, yes, another boilerplate you can use. However, the Mongoose ODM [also stores the password as type *String*](https://github.com/mjhea0/passport-local-express4/blob/master/models/account.js#L7), so these passwords are also stored in plaintext, only this time on the MongoDB instance. ([Everyone knows MongoDB instances are generally *very* secure](https://www.shodan.io/report/nlrw9g59).) + +You could accuse me of cherry-picking tutorials, and you’d be right, if cherry picking means selecting from the first page of Google results. Let’s choose the [higher-ranked-in-results *passport-local* tutorial from TutsPlus](https://code.tutsplus.com/tutorials/authenticating-nodejs-applications-with-passport--cms-21619). This one is better, in that [it uses bcrypt with a cost factor of 10 for password hashing,](https://github.com/tutsplus/passport-mongo/blob/master/passport/login.js) and defers the synchronous bcrypt hash checks using *process.nextTick*.
The top result on Google, [the tutorial from scotch.io](https://scotch.io/tutorials/easy-node-authentication-setup-and-local), also uses [bcrypt with a lesser cost factor of 8](https://github.com/scotch-io/easy-node-authentication/blob/local/app/models/user.js#L37). Both of these are small, but 8 is really small. Most *bcrypt* libraries these days use 12. [The cost factor of 8 was for administrator accounts *eighteen years ago*](https://www.usenix.org/legacy/publications/library/proceedings/usenix99/provos/provos_html/node6.html) when the original bcrypt paper was released. + +Password storage aside, neither of these tutorials implements password reset functionality, which is left as an exercise to the developer and comes with its own pitfalls. + +### Mistake two: password reset + +A sister security issue to password storage is that of password reset, and none of the top basic tutorials explain how to do this at all with Passport. You’ll have to follow another. + +There are a thousand ways to fuck this up. The most common ways I have witnessed that people get password reset wrong are: + +1. **Predictable tokens. **Tokens that are based upon the current time are a good example. Tokens made by bad pseudorandom number generators are less obvious. +2. **Bad storage. **Storing unencrypted password reset tokens in your DB means that if the DB is compromised, those tokens are effectively plaintext passwords. Generating a long token with a cryptographically secure random number generator stops remote brute force attacks on reset tokens, but it doesn’t stop local attacks. Reset tokens are credentials and should be treated as such. +3. **No token expiry. **Not expiring your tokens gives attackers more time to exploit the reset window. +4. **No secondary data verification.** Security questions are the *de facto* data verification for a reset. Of course, then the developer has to choose *good security questions*.
[Security questions have their own problems](https://www.kaspersky.com/blog/security-questions-are-insecure/13004/). While this may seem like security overkill, the email address is something you have, not something you know, and conflates the authentication factors. Your email address becomes the key to every account that just sends a reset token to email. + +If you’re new to all of this, try OWASP’s [Password Reset Cheat Sheet](https://www.owasp.org/index.php/Forgot_Password_Cheat_Sheet). Let’s get back to what the Node world has to offer for us on this. + +We’ll divert to *npm* for a second and [look for password reset](https://www.npmjs.com/search?q=password%20reset&page=1&ranking=popularity), to see if anyone’s made this. There’s a five-year-old package from the (generally awesome) substack. On the Node.js timeline this module is jurassic, and if I wanted to nitpick, [Math.random() is predictable in V8](https://security.stackexchange.com/questions/84906/predicting-math-random-numbers), so [it shouldn’t be used for token generation](https://github.com/substack/node-password-reset/blob/master/index.js#L73). Also, it doesn’t use Passport, so we’ll move on. + +Stack Overflow isn’t of too much help, as developer relations from a company called Stormpath loved plugging their IaaS startup on every imaginable post regarding this. [Their documentation also popped up everywhere](https://docs.stormpath.com/client-api/product-guide/latest/password_reset.html) and they have [a blogvertisement on password reset, as well](https://stormpath.com/blog/the-pain-of-password-reset). However, all of this is for naught as Stormpath is defunct, [and it shuts down entirely](https://stormpath.com/) August 17, 2017. + +Alright, back to Google, for the only tutorial that seems to exist out there. We’ll take [the first result](http://sahatyalkabov.com/how-to-implement-password-reset-in-nodejs/) for the Google search *express passport password reset.
*Here is our old friend *bcrypt* again, with an even smaller cost factor of 5 used in the text, which is *far* too small a cost factor for modern use. + +However, this tutorial is pretty solid compared to others in that it uses *crypto.randomBytes* to generate cryptographically secure random tokens, and expires them if they aren’t used. However, practices #2 and #4 above aren’t honored by this otherwise comprehensive tutorial, and thus the password reset tokens themselves are vulnerable to authentication mistake number one, credential storage. + +Thankfully, the reset expiry limits how long a stolen token stays useful. However, these tokens are especially fun if an attacker has read access to the user objects in the DB via BSON injection or can access Mongo freely due to misconfiguration. The attacker can just issue password resets for every user, read the unencrypted tokens from the DB, and set their own passwords for user accounts instead of having to go through the costly process of dictionary attacks on bcrypt hashes with a GPU rig. + +### Mistake three: API tokens + +API tokens are credentials. They are just as sensitive as passwords or reset tokens. Almost every developer knows this and tries to hold their AWS keys, Twitter secrets, etc. close to their chest; however, this doesn’t seem to transfer into the code being authored. + +Let’s use [JSON Web Tokens](https://jwt.io/) for API credentials. Having a stateless, blacklistable, claimable token is better than the old API key/secret pattern that has been used for the better part of a decade. Perhaps our junior Node.js dev has heard of JWT somewhere before, or saw *passport-jwt* and decided to implement the JWT strategy. In any case, JWT is where everyone seems to be moving in the Node.js sphere of influence. (The venerable [Thomas Ptacek will argue that JWT is bad](https://news.ycombinator.com/item?id=13866883) but I’m afraid that ship has sailed here.) 
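Since reset tokens and API tokens are both credentials, the storage rule from mistake two applies to each of them. What follows is a minimal sketch of that rule, not code from any of the tutorials discussed: the `issueToken` and `redeemToken` helpers, the one-hour lifetime, and the in-memory `Map` standing in for a database are all hypothetical. The token comes from `crypto.randomBytes`, only its SHA-256 hash is stored, it expires, and it can be redeemed once.

```javascript
const crypto = require('crypto');

// Hypothetical stand-in for a database table keyed by user id.
const store = new Map();
const TOKEN_TTL_MS = 60 * 60 * 1000; // assumed one-hour validity window

// A fast hash is acceptable here only because the token is long and random.
function hashToken(token) {
  return crypto.createHash('sha256').update(token).digest('hex');
}

// Returns the raw token (to be sent to the user); only its hash is stored.
function issueToken(userId, now = Date.now()) {
  const token = crypto.randomBytes(32).toString('hex');
  store.set(userId, { tokenHash: hashToken(token), expiresAt: now + TOKEN_TTL_MS });
  return token;
}

// Checks a presented token against the stored hash, enforcing expiry and single use.
function redeemToken(userId, presented, now = Date.now()) {
  const record = store.get(userId);
  if (!record || typeof presented !== 'string') return false;
  if (now > record.expiresAt) {
    store.delete(userId); // expired tokens are discarded
    return false;
  }
  const expected = Buffer.from(record.tokenHash, 'hex');
  const actual = Buffer.from(hashToken(presented), 'hex');
  if (!crypto.timingSafeEqual(expected, actual)) return false;
  store.delete(userId); // a token can be redeemed only once
  return true;
}
```

With this shape, an attacker who can read the stored records sees only hashes, which cannot be replayed as tokens. Secondary verification (practice #4) is deliberately left out of the sketch.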
+ +We’ll search for *express js jwt* on Google, and then find [Soni Pandey](https://medium.com/@pandeysoni)’s tutorial [*User Authentication using JWT (JSON Web Token) in Node.js*](https://medium.com/@pandeysoni/user-authentication-using-jwt-json-web-token-in-node-js-using-express-framework-543151a38ea1), which is the first tutorial result. Unfortunately, this doesn’t actually help us at all, since it doesn’t use Passport, but while we’re here we’ll quickly note the mistakes in credential storage: + +1. We’ll store the [JWT key in plaintext in the repository](https://github.com/pandeysoni/User-Authentication-using-JWT-JSON-Web-Token-in-Node.js-Express/blob/master/server/config/config.js#L13). +2. We’ll [use a symmetric cipher to store passwords](https://github.com/pandeysoni/User-Authentication-using-JWT-JSON-Web-Token-in-Node.js-Express/blob/master/server/config/common.js#L54). This means that I can get the encryption key and decrypt all of the passwords in the event of a breach. The encryption key is shared with the JWT secret. +3. We’ll use AES-256-CTR for password storage. We shouldn’t be using AES to start with, and this mode of operation doesn’t help. I am not sure why this mode specifically was chosen, but [the choice alone leaves the ciphertext malleable](https://crypto.stackexchange.com/a/33861). + +Welp. Let’s back out to Google and find the next tutorial. Scotch, which did an OK job with password storage in their passport-local tutorial, [just ignores what they told you before and stores the passwords in plaintext](https://github.com/scotch-io/node-token-authentication/blob/master/app/models/user.js#L7) for this example. + +Uh, we’ll give that a pass for brevity, but it doesn’t help the copypasta crew. More interestingly, this tutorial [also serializes the mongoose User object into the JWT](https://github.com/scotch-io/node-token-authentication/blob/master/server.js#L81). 
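To see why serializing the whole user object matters, here is a minimal sketch, assuming Node.js 16 or later and using only the built-in `crypto` module. The `signJwt` and `readPayload` helpers, the secret, and the `_id` field are hypothetical stand-ins rather than code from the Scotch tutorial, and hand-rolled signing like this is for illustration only; real code should use a maintained JWT library. It shows that the payload of a signed token can be read by anyone who holds the token, without the secret.

```javascript
const crypto = require('crypto');

// Base64url-encode a string (Node.js 16+ supports the 'base64url' encoding).
const b64url = (input) => Buffer.from(input).toString('base64url');

// Illustration-only HS256 signing: header.payload.signature
function signJwt(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  const signature = crypto
    .createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest('base64url');
  return `${header}.${body}.${signature}`;
}

// No secret needed: the payload segment is encoded, not encrypted.
function readPayload(token) {
  const body = token.split('.')[1];
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}

// Hypothetical user record, using the tutorial's default credentials.
const user = { _id: '42', name: 'Nick Cerminara', password: 'password' };

// Serializing the whole record puts the password inside the token.
const leaky = signJwt(user, 'hypothetical-secret');

// Signing only an identifier keeps sensitive fields out of the token.
const minimal = signJwt({ sub: user._id }, 'hypothetical-secret');
```

Calling `readPayload(leaky)` returns the full record, password included, while `readPayload(minimal)` exposes only the `sub` claim. The design point is to put in the token only what a stranger is allowed to read.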
+ +Let’s clone the Scotch tutorial repository, follow the instructions, and run it. After a *DeprecationWarning* or three from Mongoose, we can hit [*http://localhost:8080/setup*](http://localhost:8080/setup) to create the user, then get a token by posting to /api/authenticate with the default credentials of “Nick Cerminara” and “password”. A token is returned, as displayed in Postman. + +![](https://cdn-images-1.medium.com/max/1600/1*wvb2F4-Rx4I1ji2EJIyXZg.png) + +A JWT token returned from the Scotch tutorial. + +Note that JSON Web Tokens are signed, but not encrypted. That means that the big blob between the two periods is a Base64-encoded object. Quickly decoding it, we get something interesting. + +![](https://cdn-images-1.medium.com/max/1600/1*5KcDyNtIfWXVe9uVUD0A_g.png) + +I love my passwords in plaintext in tokens. + +Now, anyone who has *even an expired token* has your password, as well as whatever else is stored in the Mongoose model. Given that this one came over HTTP, I could have sniffed it off the wire. + +What about the next tutorial? The next tutorial, [*Express, Passport and JSON Web Token (jwt) Authentication for Beginners*](https://jonathanmh.com/express-passport-json-web-token-jwt-authentication-beginners/), contains the same information disclosure vulnerability. The next tutorial, from a startup called [SlatePeak, does the same serialization](http://blog.slatepeak.com/creating-a-simple-node-express-api-authentication-system-with-passport-and-jwt/). At this point, I gave up looking. + +### Mistake four: rate limiting + +As I alluded to above, I did not find a mention of rate limiting or account locking in any of these authentication tutorials. + +Without rate limiting, an adversary can perform online dictionary attacks in which a tool like [Burp Intruder](https://portswigger.net/burp/help/intruder_using.html) is run in hopes of gaining access to an account with a weak password. 
Account lockout also helps with this problem by requiring extended login information from a user the next time they log in. + +Remember, rate limiting also helps availability. *bcrypt* is a CPU-intensive function, and without rate limiting, functions using bcrypt become an application-level denial of service vector, especially at high work factors. Multiple requests for user registration or login password checking are an easy way to turn a lightweight HTTP request into costly time for your server. + +While I do not have a tutorial I can point to for these, there are tons of rate limiting middlewares for Express, such as [express-rate-limit](https://github.com/nfriedly/express-rate-limit), [express-limiter](https://www.npmjs.com/package/express-limiter), and [express-brute](https://github.com/AdamPflug/express-brute). I cannot speak to the security of these modules and have not even looked at them; generally I [recommend running a reverse proxy in production](https://expressjs.com/en/advanced/best-practice-performance.html#use-a-reverse-proxy) and allowing [rate limiting of requests to be handled by nginx](https://www.nginx.com/blog/rate-limiting-nginx/) or whatever your load balancer is. + +### Authentication is hard. + +I’m sure the tutorial developers will defend themselves with “This is just meant to explain the basics! Surely nobody will do this in production!” However, I cannot emphasize enough *just how false this is*. This is *especially* true when code is put out there in your tutorials. People will take your word for it; after all, you *do* have more expertise than they do. + +**If you’re a beginner, don’t trust your tutorials.** Copypasta from tutorials *will* likely get you, your company, and your clients in authentication trouble in the Node.js world. If you really need strong, production-ready, all-in-one authentication libraries, go back to something that holds your hand better, has better stability, and is more proven, like Rails/Devise. 
+ +The Node.js ecosystem, while accessible, still has a lot of sharp edges for JavaScript-based developers needing to write production web applications in a hurry. If you have a front-end background and don’t know other programming languages, I personally believe it is easier to pick up Ruby and stand on the shoulders of giants than it is to quickly learn how not to shoot yourself in the foot when writing these types of things from scratch. + +If you’re a tutorial writer, *please* update your tutorials, *especially* the boilerplate code. This code will become copypasta in others’ production web applications. + +If you are a die-hard Node.js developer, hopefully you’ve learned a few things not to do in the authentication system you’re rolling with Passport. You will likely get something wrong. I haven’t gotten close to covering all of the ways to get it wrong in this one post. It shouldn’t be your job to roll your own auth for your Express application. There should be something better. + +If you’re interested in better securing the Node ecosystem, please DM me [@_micaksica](https://twitter.com/_micaksica) on Twitter. + +> This post was brought to you by espresso because I’m out of sake. 
+ + + --- + + > The [Juejin Translation Project](https://github.com/xitu/gold-miner) is a community that translates high-quality internet technology articles, sourced from English articles shared on [Juejin](https://juejin.im). It covers [Android](https://github.com/xitu/gold-miner#android), [iOS](https://github.com/xitu/gold-miner#ios), [React](https://github.com/xitu/gold-miner#react), [front end](https://github.com/xitu/gold-miner#前端), [back end](https://github.com/xitu/gold-miner#后端), [product](https://github.com/xitu/gold-miner#产品), [design](https://github.com/xitu/gold-miner#设计), and other areas. For more high-quality translations, follow the [Juejin Translation Project](https://github.com/xitu/gold-miner), its [official Weibo](http://weibo.com/juejinfanyi), and its [Zhihu column](https://zhuanlan.zhihu.com/juejinfanyi). + \ No newline at end of file From 4d280204b5433f234d8fe2054a0c18199c28f87e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A0=B9=E5=8F=B7=E4=B8=89?= Date: Wed, 9 Aug 2017 17:59:51 +0800 Subject: [PATCH 15/15] :sparkles: Create building-account-systems.md --- TODO/building-account-systems.md | 174 +++++++++++++++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 TODO/building-account-systems.md diff --git a/TODO/building-account-systems.md b/TODO/building-account-systems.md new file mode 100644 index 00000000000..8446f788441 --- /dev/null +++ b/TODO/building-account-systems.md @@ -0,0 +1,174 @@ + + > * Original article: [Building account systems](https://blog.plan99.net/building-account-systems-f790bf5fdbe0) + > * Original author: [Mike Hearn](https://blog.plan99.net/@octskyward) + > * Translation from: [Juejin Translation Project](https://github.com/xitu/gold-miner) + > * Permalink to this article: [https://github.com/xitu/gold-miner/blob/master/TODO/building-account-systems.md](https://github.com/xitu/gold-miner/blob/master/TODO/building-account-systems.md) + > * Translator: + > * Proofreaders: + + # Building account systems + + ![](https://cdn-images-1.medium.com/max/1600/1*gMIGLbIgwnSC8huyC5ugKQ.jpeg) + +[Troy Hunt](https://www.troyhunt.com/) recently published a blog post titled “[Authentication guidance for the modern era](https://www.troyhunt.com/passwords-evolved-authentication-guidance-for-the-modern-era/)”. 
It has a big pile of solid advice on what password rules your website should use, with references to formal government recommendations — always useful for convincing colleagues or a boss. + +One of the projects I worked on during my time at Google was their unified account system ([specifically, anti-hijacking](https://googleblog.blogspot.ch/2013/02/an-update-on-our-war-against-account.html)). Login systems are a part of most websites, so reading Troy’s article inspired me to put together some advice for building them. + +### 1. Ideally, don’t + +Whatever your business is, user authentication is not your core competency. Modern login systems are expected to do a lot. Passwords are only the start. If you are successful you will eventually also want: + +- Forgotten password recovery +- Email address verification +- Log out, which is harder than it looks (see below) +- Password brute forcing protection +- Two-factor authentication via SMS, phone app and hardware key +- Account hijacking protection (when an attacker already knows the correct password and the user does not have 2FA) +- Region/language/name/profile photo preferences +- Support for mobile/desktop sign-in +- Unusual activity notifications +- Phone-only signin + +As big companies raise user expectations and attackers get better, the effort to keep up is becoming impractical. Fortunately, you can outsource your authentication to those companies using OAuth. + +Often web developers see adding a “[sign in with Facebook](https://developers.facebook.com/docs/facebook-login)” or “[sign in with Google](https://developers.google.com/identity/sign-in/web/sign-in)” button as a kind of optional nice-to-have, which comes only after building their own account system. If you’re reading this because you’re starting a new website from scratch, I argue that “Sign in with …” should be the only option you offer. Building your own account system these days is like building your own datacenters instead of using AWS. 
It’s an expensive distraction. + +Sometimes people worry that if they only offer “Sign in with …”, the big ID providers might one day try to steal their customers. The usual indication someone’s worrying about this is you sign in with an ID provider and are then asked to set a password anyway. Don’t worry about this — in the unlikely event that this happens, you can always migrate your customer base to a new system of your own later by simply emailing them a link. + +### 2. Use email / phone numbers to identify users + +Don’t ask users to pick usernames, even if you want a username-oriented experience like on a chat forum. Users are always identified to you by email address, phone number or both, and if you want a different name for user-to-user identification (a display name), that should be chosen separately. Why? + +- You will ask for the email address anyway. +- If the username becomes a form of self-expression on your service, users will want to change it from time to time. +- Users forget usernames but not their email/phone numbers. +- Picking a username is frustrating. Every time a user selects a name that’s already taken, some will give up and your customer acquisition pipeline narrows. +- Separating username from display name will reduce the temptation to impose arbitrary restrictions on how users appear, like forbidding spaces. + +### 3. Don’t use passwords at all + +![](https://cdn-images-1.medium.com/max/1600/1*1S1yaiqUAmfLZE2uF5AvDw.jpeg) + +If you aren’t quite ready to rely 100% on third party ID providers, at least do everyone a favour and don’t make users pick a password at all. + +This isn’t as stupid as it sounds. You are already asking the user for their email address. The very first feature you will add to your login system after going live is forgotten password recovery, which will work by sending the user a clickable link via an email. 
Therefore anyone who can read your user’s email can log in as them anyway and your own site password adds no extra security. + +Instead, skip the first step and go straight to the second — your login system can be as simple as emailing the user a link that sets a login cookie when clicked. [Medium.com is an example of a site that does this](https://blog.medium.com/signing-in-to-medium-by-email-aacc21134fcd). + +This approach works as long as every device where the user might log in to your service has an email client. This is true for desktops, laptops, phones and tablets. It is not true for games consoles or TVs, but you probably aren’t targeting them yet. If you are it’s better to use a Bluetooth style pairing process anyway, as these devices don’t have convenient keyboards. + +There have been suggestions in the past that users can be confused by the lack of a password entry box. But the modern Google sign-in experience starts by asking the user for their email address only, so it’s unlikely users are confused by this any more … and the benefits are huge. + +This approach has an additional benefit: some users have phone numbers but not email addresses. This is especially true in developing countries, so if this is a possible target market for your website you may eventually want to support users who can only log in by receiving a code to their phone. Such accounts won’t have passwords at all, so if you assumed all users do have passwords it will require you to go back and add lots of special cases to security-sensitive code paths (this can easily lead to fatal mistakes). + +### 4. Don’t use secret questions + +If you simply must use passwords — perhaps you don’t want to try explaining to your boss why you did things differently — at the very least don’t let the user recover using secret questions and answers. + +- The answers to secret questions are often trivially guessed. 
Users find it incredibly hard to think of questions that they, and nobody else, can answer. +- Pre-supplied questions make the guessing problem worse. +- Pre-supplied questions often have cultural bias that makes them useless for many users (e.g. “what was your high school’s mascot?”). +- Some savvy users realise they can’t think of a hard-to-guess answer so just use it as a second password field, meaning they then can’t recover when they forget. +- There is a long list of high profile celebrity and VIP hacks that worked by abusing password recovery flows. You don’t want this to be you. + +Google had severe problems with secret questions. [A couple of my old colleagues published research on it](https://security.googleblog.com/2015/05/new-research-some-tough-questions-for.html) that’s worth reading or watching (video below). + +[![](https://i.ytimg.com/vi_webp/h8YwQvJm7rk/maxresdefault.webp)](https://www.youtube.com/embed/h8YwQvJm7rk) + +A talk on secret questions/answers at Google + +Some examples of problematic Q/As: + +- **Q: Favourite food. A: Pizza.** The answer is always pizza. You can break into around 20% of English-speaking accounts with just a single guess using this answer. With just 10 guesses you can break into over a third of all English-speaking accounts that have this question. For Korean accounts you can get into 43% of all accounts within 10 guesses. +- **Q: On what day did I get married? A: Thursday.** A custom question, but fatally flawed: the attacker needed only 5 guesses, too few to be detected as a brute forcing attempt. +- **Q: In which city was I born? A: Seoul.** In some countries almost everyone lives in a handful of large cities. Observing the language the ID verification UI is written in can thus narrow down the list of possible cities dramatically. For Korean accounts you only need 10 guesses to break 40% of accounts using this question. +- **Q: What was the name of my first teacher? A: Mr Smith, Smith, John, John S. 
Smith, JOHN SMITH, Jon Smth.** All of these answers are correct but won’t be matched by a straightforward implementation. I added fuzzy matching for questions like these because users so often got them ‘nearly’ right. The matching logic often needs to understand something about the question (Levenshtein distance by itself is insufficient for things like street addresses). Good luck doing this for all the languages your product supports! + +Not surprisingly, professional account systems do not use knowledge of secret Q/A alone to allow users to recover accounts. It’s just one signal amongst many. I give you a <2% chance of writing something sophisticated enough to get this right. That’s why Google phased secret Q/A out in favour of SMS recovery. SMS based recovery has its own issues, but it’s still a lot better than secret questions. + +### 5. Avoid CAPTCHAs + +CAPTCHAs are a frequent feature of many login forms. They’re also something I did a bit of work on at Google. Unfortunately, CAPTCHAs are these days extremely low value and often implemented badly. + +![](https://cdn-images-1.medium.com/max/1600/1*_RLdNjTDj6VzHRsIit5ODg.gif) + +All these CAPTCHAs are uselessly weak + +The thing to understand about CAPTCHAs is they’re only useful for imposing a very basic throttle on automated attacks. They will not protect your account system against bulk registrations. Other than account security I also spent some years working on Google signup abuse. We routinely saw spammers solve tens of millions of our hardest CAPTCHAs. There are professional CAPTCHA solving firms like [DeathByCaptcha](http://www.deathbycaptcha.com/user/login) that use a mix of OCR and human solving. Ordinary CAPTCHAs block blind people from signing up, which may be a problem, but speech recognition based CAPTCHAs are either trivially solved by computers or unsolvable by humans. + +CAPTCHAs are most useful for blocking password brute forcing attempts. 
Brute forces may require hundreds of thousands or millions of attempts against an account to find the right password. A simple way to stop them without annoying users is to start throwing CAPTCHAs if the user has had some recent failed login attempts. Even easy CAPTCHAs are enough to throw a small delay into a bot loop. + +CAPTCHAs are much less useful for stopping bulk account registration. Building systems to detect and stop that is a whole other ball game; one I spent several years playing. To get a sense of how hard it can be, go to [buyaccs.com](http://buyaccs.com) and observe the huge variation in prices charged by underground account sellers. The higher prices are caused by better defence systems. Unless you’re one of the Big 5 you won’t be able to beat the work we did on account signup security — it’s just one more reason to outsource login to the major players. + +If you *still* want to use CAPTCHAs, use [reCAPTCHA](https://www.google.com/recaptcha/intro/) and make sure your CAPTCHA is bound appropriately to avoid replay attacks. Don’t try to roll your own or use a kit you found on GitHub. Such CAPTCHAs are invariably solved by modern OCR and will accomplish nothing except reducing the success rate of customer signup. + +### 6. Outsource 2-factor authentication + +![](https://cdn-images-1.medium.com/max/1200/1*GmSsoIZQN49cIBMeNDoYlA.png) + +Two-factor auth is a pretty common feature these days. Again, doing this well is hard, expensive and you do not want to implement it yourself. + +- SMS is unreliable, especially in some countries. Recovery codes will occasionally just not show up. You will eventually want to implement phone calls with speech synthesis as a result because phone calls are much more reliable, but now you need multi-lingual speech synthesis engines. +- Doing lots of SMSs or phone calls is extremely expensive, even if you can negotiate good bulk discounts. +- People lose access to their phone numbers all the time. 
If you rely on email addresses your password recovery flow can be pretty easy, but once solid 2FA is introduced your password recovery becomes the weakest point in the system. If you don’t upgrade it attackers will simply go around it. If you block it, then you will discover that … +- 2FA can be abused by attackers who add it to accounts they phished or hacked. This is to prevent the real users from stealing the account back whilst the malicious activity is performed. +- Phone numbers are vulnerable to porting attacks, so the trend is towards asking users to set up mobile apps or security keys. Implementing those is even more work, and of course both of those can be lost too, so you will ultimately still need some customer support flow to help them recover. +- As you’re figuring out, 2FA adds a lot of manual customer support work because you can no longer just push users towards email or secret Q/A based recovery. That’s expensive. + +Some of these problems are fundamental, but most of them are solved already by the big players who will pay the phone bills and customer support people on your behalf, for free! + +Still, if you don’t want to use them, there are startups that will solve small parts of the 2FA puzzle for you. + +### 7. Don’t force password changes + +Troy already covered this just fine so I won’t repeat it here, except to say that this is really important. Don’t require users to change passwords just because it’s been a while. + +- Some users won’t make it through the process and you will bleed users. +- Some users will be smarter than you and use tricks like changing their password (once, twice, three times) and then immediately changing it back to their old password, meaning you will end up wanting to store a history of recent passwords to prevent this behaviour. But I bet your first implementation won’t do this. +- It doesn’t improve security anyway. + +### 8. Don’t expire sessions + +Yet another best practice that isn’t. 
It’s tempting to set your session cookies to expire. Sometimes people think that this improves security for the same reason they think expiring people’s passwords improves security. + +- Attackers tend to perform malicious activities immediately, so expirations don’t help much. +- Session expiry trains users that random unexpected password prompts are normal, which makes them incredibly easy to phish. +- Sessions that expire randomly create an explosion of bugs that your developers will waste large amounts of time on. Most parts of your website will not be written to handle the case of sessions expiring half way through an action, so you’ll have to go back and fix them, assuming you even notice the problems at all. Expiry tends to surface as user reports of random flakiness which are hard to track down. + +### 9. Remember sign-out + +Getting sign-out wrong is remarkably common in immature account systems. It sounds superficially easy but the most obvious ways to do it have flaws. + +- Simply deleting the session cookie is fine as a convenience to the user, but means you can’t recover from XSS. Once an XSS is found you may wish to invalidate possibly stolen session cookies, but if sign-out is just “ask the browser to delete the cookie” then you can’t do it. +- Adding timestamps to session cookies and then setting a “last sign-out time” requires every action to check against the accounts database to discover if the user’s session is too old. This can slow things down, meaning developers will be tempted to optimise it out (it doesn’t seem dangerous to do so after all). But then if they remove the check for an endpoint of interest to attackers, you’ve still got the problem in step one. Additionally, this means signing out of one browser or device signs the user out of all of them, which isn’t expected behaviour. + +The right way to do this is keeping a list of invalidated session cookies with in-memory caching. 
But for most companies, there’s a less costly approach which is good enough: have the user’s sign-out link be just a way to clear the session cookie and nothing more, then make session cookies expire but be automatically and silently replaced every 5 minutes or so. The act of replacing an expired session cookie consults the database to see if the administrators have forced a logout. If the user is presenting an expired cookie then they are required to log in again. This recognises that cleaning up after cookies may have been stolen is a relatively rare event. + +### 10. Separate account emails from marketing mail + +![](https://cdn-images-1.medium.com/max/1600/1*kg2ZRHcCDGJ83rEz8D7saA.png) + +The obvious way to send password recovery links, signup verifications, etc. is simply from your company’s main email server. Unfortunately, some people in your company are trying to build a “relationship” with the user by sending them commercial mails they don’t want. + +Even if users agreed to receive these during account signup, many of them don’t want them anymore and some will solve this by reporting them as spam. This is an expedient solution for the savvy user who has noticed that simply clicking “Report spam” a couple of times makes the emails go away, without any mental effort expended on finding tiny light-grey-on-white-6pt-font unsubscribe links or … gasp … writing email filters. + +Unfortunately, this entirely normal behaviour will start to degrade the reputation of your main email domain. Mail from your account system may start going into the user’s spam folder. We’ve all seen warnings on signup or password recovery flows telling us to check our spam folders; that’s why. + +One way to solve this is by buying a separate top-level domain to send your account mails from and making sure to configure DKIM. But then some users will notice the mismatch and report your email as phishing instead. 
The best solution is to send your marketing emails from a different DKIM domain, but that will likely involve picking a fight with your product folks. Still … the moment you chose to roll your own you accepted this pain, remember? + +### 11. Keep your password database well protected + +If you have passwords, you have a database attackers want (and frequently get). They don’t care about your company directly, they just want the passwords so they can try them at higher value targets. Yet data breaches are embarrassing and can lead to big penalties even if the direct impact on your customers is low. A database of OAuth tokens is of far less value to attackers and thus you’re much less likely to be attacked. + +### Conclusion + +There is far more I could write about account systems. Defending your site against malicious account hacking/signup is an entire book all by itself. I can’t write that book but you can watch [this video of a talk I gave in 2012](https://www.youtube.com/watch?v=XwsaZ4-3muA) instead, if you’re curious. + +But it’s fair to say the task is deceptively large. That’s why I keep recommending you bite the bullet and outsource your account management to the big boys. Fiddling with CAPTCHAs is not your core business. Writing design documents for “log out” is not your core business. Diagnosing why you’re bleeding users who forgot their password is not your core business. Diagnosing why SMS message delivery to Peru isn’t reliable isn’t your core business. Every dollar you spend on these things is a dollar your competitors who use “Sign in with …” are spending on their core business. + +So throw out your password database, and don’t look back. 
+ + + --- + + > The [Juejin Translation Project](https://github.com/xitu/gold-miner) is a community that translates high-quality internet technology articles, sourced from English articles shared on [Juejin](https://juejin.im). It covers [Android](https://github.com/xitu/gold-miner#android), [iOS](https://github.com/xitu/gold-miner#ios), [React](https://github.com/xitu/gold-miner#react), [front end](https://github.com/xitu/gold-miner#前端), [back end](https://github.com/xitu/gold-miner#后端), [product](https://github.com/xitu/gold-miner#产品), [design](https://github.com/xitu/gold-miner#设计), and other areas. For more high-quality translations, follow the [Juejin Translation Project](https://github.com/xitu/gold-miner), its [official Weibo](http://weibo.com/juejinfanyi), and its [Zhihu column](https://zhuanlan.zhihu.com/juejinfanyi). + \ No newline at end of file